Multivalent Antigen-Binding Proteins

This invention was made with Government Support under SBIR Grant 
5R44 GM 39662-03 awarded by the National Institutes of Health, National
Institute of General Medical Sciences. The Government has certain rights in 
the invention. 

Cross-Reference to Related Applications 

This application is a continuation-in-part of U.S. Patent Application 
Serial Number 07/796,936, filed Nov. 25, 1991, which is a continuation-in- 
part of U.S. Patent Application Serial No. 07/512,910 filed April 25, 1990, 
which is a continuation-in-part of Serial No. 07/299,617, filed Jan. 1, 1989, 
issued as U.S. Patent No. 4,946,778 (Ladner et al.), which was a
continuation-in-part of Serial No. 092,110, filed Sept. 2, 1987, and Serial No. 
902,971, filed Sept. 2, 1986, now abandoned, the contents of all of which are 
fully incorporated herein by reference. 

Background of the Invention 

1. Field of the Invention

The present invention relates generally to the production of antigen- 
binding molecules. More specifically, the invention relates to multivalent 
forms of antigen-binding proteins. Compositions of, genetic constructions for, 
methods of use, and methods for producing these multivalent antigen-binding 
proteins are disclosed. 
2. Description of the Background Art

Antibodies are proteins generated by the immune system to provide a 
specific molecule capable of complexing with an invading molecule, termed
an antigen. Figure 14 shows the structure of a typical antibody molecule.
Natural antibodies have two identical antigen-binding sites, both of which are
specific to a particular antigen. The antibody molecule "recognizes" the
antigen by complexing its antigen-binding sites with areas of the antigen
termed epitopes. The epitopes fit into the conformational architecture of the
antigen-binding sites of the antibody, enabling the antibody to bind to the 
antigen. 

The antibody molecule is composed of two identical heavy and two
identical light polypeptide chains, held together by interchain disulfide bonds
(see Fig. 14). The remainder of this discussion will refer only to one
light/heavy pair of chains, as each light/heavy pair is identical. Each
individual light and heavy chain folds into regions of approximately 110 amino
acids, assuming a conserved three-dimensional conformation. The light chain
comprises one variable region (termed VL) and one constant region (CL), while
the heavy chain comprises one variable region (VH) and three constant regions
(CH1, CH2 and CH3). Pairs of regions associate to form discrete structures as
shown in Figure 14. In particular, the light and heavy chain variable regions,
VL and VH, associate to form an "Fv" area which contains the antigen-binding
site.

The variable regions of both heavy and light chains show considerable 
variability in structure and amino acid composition from one antibody
molecule to another, whereas the constant regions show little variability. The
term "variable" as used in this specification refers to the diverse nature of the
amino acid sequences of the antibody heavy and light chain variable regions.
Each antibody recognizes and binds antigen through the binding site defined
by the association of the heavy and light chain variable regions into an Fv
area. The light-chain variable region VL and the heavy-chain variable region
VH of a particular antibody molecule have specific amino acid sequences that
allow the antigen-binding site to assume a conformation that binds to the
antigen epitope recognized by that particular antibody. 

Within the variable regions are found regions in which the amino acid 
sequence is extremely variable from one antibody to another. Three of these 
so-called "hypervariable" regions or "complementarity-determining regions" 
(CDR's) are found in each of the light and heavy chains. The three CDR's
from a light chain and the three CDR's from a corresponding heavy chain 
form the antigen-binding site. 

Cleavage of the naturally-occurring antibody molecule with the 
proteolytic enzyme papain generates fragments which retain their antigen- 
binding site. These fragments, commonly known as Fab's (for Fragment, 
antigen binding site) are composed of the CL, VL, CH1 and VH regions of the
antibody. In the Fab the light chain and the fragment of the heavy chain are 
covalently linked by a disulfide linkage. 

Recent advances in immunobiology, recombinant DNA technology, and 
computer science have allowed the creation of single polypeptide chain 
molecules that bind antigen. These single-chain antigen-binding molecules 
incorporate a linker polypeptide to bridge the individual variable regions,
VL and VH, into a single polypeptide chain. A computer-assisted method for
linker design is described more particularly in U.S. Patent No. 4,704,692,
issued to Ladner et al. in November, 1987, and incorporated herein by
reference. A description of the theory and production of single-chain antigen-
binding proteins is found in U.S. Patent No. 4,946,778 (Ladner et al.), issued
August 7, 1990, and incorporated herein by reference. The single-chain
antigen-binding proteins produced under the process recited in U.S. Patent
4,946,778 have binding specificity and affinity substantially similar to that of
the corresponding Fab fragment. 

Bifunctional, or bispecific, antibodies have antigen binding sites of 
different specificities. Bispecific antibodies have been generated to deliver 
cells, cytotoxins, or drugs to specific sites. An important use has been to 
deliver host cytotoxic cells, such as natural killer or cytotoxic T cells, to 
specific cellular targets. (U.D. Staerz, O. Kanagawa, M.J. Bevan, Nature 
314:628 (1985); S. Songsivilai, P.J. Lachmann, Clin. Exp. Immunol. 79:315
(1990)). Another important use has been to deliver cytotoxic proteins to
specific cellular targets. (V. Raso, T. Griffin, Cancer Res. 41:2073 (1981);
S. Honda, Y. Ichimori, S. Iwasa, Cytotechnology 4:59 (1990)). Another
important use has been to deliver anti-cancer non-protein drugs to specific
cellular targets (J. Corvalan, W. Smith, V. Gore, Intl. J. Cancer Suppl. 2:22
(1988); Pimm et al., British J. of Cancer 67:508 (1990)). Such bispecific
antibodies have been prepared by chemical cross-linking (M. Brennan et al.,
Science 229:81 (1985)), disulfide exchange, or the production of hybrid-
hybridomas (quadromas). Quadromas are constructed by fusing hybridomas
that secrete two different types of antibodies against two different antigens
(Kurokawa, T. et al., Biotechnology 7:1163 (1989)).



Summary of the Invention 



This invention relates to the discovery that multivalent forms of single-
chain antigen-binding proteins have significant utility beyond that of the
monovalent single-chain antigen-binding proteins. A multivalent antigen- 
binding protein has more than one antigen-binding site. Enhanced binding 
activity, di- and multi-specific binding, and other novel uses of multivalent 
antigen-binding proteins have been demonstrated or are envisioned here. 
Accordingly, the invention is directed to multivalent forms of single-chain 
antigen-binding proteins, compositions of multivalent and single-chain antigen- 
binding proteins, methods of making and purifying multivalent forms of single- 
chain antigen-binding proteins, and uses for multivalent forms of single-chain 
antigen-binding proteins. The invention provides a multivalent antigen-binding 
protein comprising two or more single-chain protein molecules, each single-
chain molecule comprising a first polypeptide comprising the binding portion
of the variable region of an antibody heavy or light chain; a second 
polypeptide comprising the binding portion of the variable region of an 
antibody heavy or light chain; and a peptide linker linking the first and second 
polypeptides into a single-chain protein. 
Also provided is a composition comprising a multivalent antigen-
binding protein substantially free of single-chain molecules.

Also provided is an aqueous composition comprising an excess of
multivalent antigen-binding protein over single-chain molecules.

A method of producing a multivalent antigen-binding protein is
provided, comprising the steps of producing a composition comprising
multivalent antigen-binding protein and single-chain molecules, each single-
chain molecule comprising a first polypeptide comprising the binding portion
of the variable region of an antibody heavy or light chain; a second
polypeptide comprising the binding portion of the variable region of an
antibody heavy or light chain; and a peptide linker linking the first and second
polypeptides into a single-chain molecule; separating the multivalent protein
from the single-chain molecules; and recovering the multivalent protein.

Also provided is a method of producing multivalent antigen-binding
protein, comprising the steps of producing a composition comprising single-
chain molecules as previously defined; dissociating the single-chain molecules;
reassociating the single-chain molecules; separating the resulting multivalent
antigen-binding proteins from the single-chain molecules; and recovering the
multivalent proteins.

Also provided is another method of producing a multivalent antigen-
binding protein, comprising the step of chemically cross-linking at least two
single-chain antigen-binding molecules.

Also provided is another method of producing a multivalent antigen-
binding protein, comprising the steps of producing a composition comprising
single-chain molecules as previously defined; concentrating said single-chain
molecules; separating said multivalent protein from said single-chain
molecules; and finally recovering said multivalent protein.

Also provided is another method of producing a multivalent antigen-
binding protein comprising two or more single-chain molecules, each single-
chain molecule as previously defined, said method comprising: providing a
genetic sequence coding for said single-chain molecule; transforming a host
cell or cells with said sequence; expressing said sequence in said host or hosts;
and recovering said multivalent protein.

Another aspect of the invention includes a method of detecting an 
antigen in or suspected of being in a sample, which comprises contacting said 
sample with the multivalent antigen-binding protein of claim 1 and detecting 
whether said multivalent antigen-binding protein has bound to said antigen. 

Another aspect of the invention includes a method of imaging the 
internal structure of an animal, comprising administering to said animal an
effective amount of a labeled form of the multivalent antigen-binding protein
of claim 1 and measuring detectable radiation associated with said animal. 

Another aspect of the invention includes a composition comprising an
association of a multivalent antigen-binding protein with a therapeutically or
diagnostically effective agent.

Another aspect of this invention is a single-chain protein comprising: 
a first polypeptide comprising the binding portion of the variable region of an
antibody light chain; a second polypeptide comprising the binding portion of
the variable region of an antibody light chain; a peptide linker linking said first
and second polypeptides (a) and (b) into said single-chain protein. 

Another aspect of the present invention includes the genetic
constructions encoding the combinations of regions VL-VL and VH-VH for
single-chain molecules, and encoding multivalent antigen-binding proteins. 

Another part of this invention is a multivalent single-chain antigen-
binding protein comprising: a first polypeptide comprising the binding portion
of the variable region of an antibody heavy or light chain; a second
polypeptide comprising the binding portion of the variable region of an
antibody heavy or light chain; a peptide linker linking said first and second
polypeptides (a) and (b) into said multivalent protein; a third polypeptide
comprising the binding portion of the variable region of an antibody heavy or
light chain; a fourth polypeptide comprising the binding portion of the variable
region of an antibody heavy or light chain; a peptide linker linking said third
and fourth polypeptides (d) and (e) into said multivalent protein; and a peptide
linker linking said second and third polypeptides (b) and (d) into said
multivalent protein. Also included are genetic constructions coding for this
multivalent single-chain antigen-binding protein.

Also included are replicable cloning or expression vehicles including 
plasmids, hosts transformed with the aforementioned genetic sequences, and 
methods of producing multivalent proteins with the sequences, transformed 
hosts, and expression vehicles. 

Methods of use are provided, such as a method of using the multivalent 
antigen-binding protein to diagnose a medical condition; a method of using the 
multivalent protein as a carrier to image the specific bodily organs of an 
animal; a therapeutic method of using the multivalent protein to treat a medical 
condition; and an immunotherapeutic method of conjugating a multivalent 
protein with a therapeutically or diagnostically effective agent. Also included 
are labelled multivalent proteins, improved immunoassays using them, and 
improved immunoaffinity purifications. 

An advantage of using multivalent antigen-binding proteins instead of 
single-chain antigen-binding molecules or Fab fragments lies in the enhanced 
binding ability of the multivalent form. Enhanced binding occurs because the 
multivalent form has more binding sites per molecule. Another advantage of 
the present invention is the ability to use multivalent antigen-binding proteins 
as multi-specific binding molecules. 

An advantage of using multivalent antigen-binding proteins instead of
whole antibodies is the enhanced clearing of the multivalent antigen-binding
proteins from the serum due to their smaller size as compared to whole
antibodies, which may afford lower background in imaging applications.
Multivalent antigen-binding proteins may penetrate solid tumors better than 
monoclonals, resulting in better tumor-fighting ability. Also, because they are 
smaller and lack the Fc component of intact antibodies, the multivalent 
antigen-binding proteins of the present invention may be less immunogenic 
than whole antibodies. The Fc component of whole antibodies also contains 
binding sites for liver, spleen and certain other cells and its absence should 
thus reduce accumulation in non-target tissues. 
Another advantage of multivalent antigen-binding proteins is the ease
with which they may be produced and engineered, as compared to the 
myeloma-fusing technique pioneered by Kohler and Milstein that is used to 
produce whole antibodies. 

Brief Description of the Drawings. 

The present invention as defined in the claims can be better understood 
with reference to the text and to the following drawings: 

FIG. 1A is a schematic two-dimensional representation of two identical
single-chain antigen-binding protein molecules, each comprising a variable
light chain region (VL), a variable heavy chain region (VH), and a polypeptide
linker joining the two regions. The single-chain antigen-binding protein 
molecules are shown binding antigen in their antigen-binding sites. 

FIG. 1B depicts a hypothetical homodivalent antigen-binding protein
formed by association of the polypeptide linkers of two monovalent single-
chain antigen-binding proteins from Fig. 1A (the Association model). The
divalent antigen-binding protein is formed by the concentration-driven
association of two identical single-chain antigen-binding protein molecules. 

FIG. 1C depicts the hypothetical divalent protein of FIG. 1B with
bound antigen molecules occupying both antigen-binding sites. 

FIG. 2A depicts the hypothetical homodivalent protein of Figure 1B.

FIG. 2B depicts three single-chain antigen-binding protein molecules
associated in a hypothetical trimer. 

FIG. 2C depicts a hypothetical tetramer of four single-chain antigen-
binding protein molecules. 

FIG. 3A depicts two separate and distinct monovalent single-chain 
antigen-binding proteins, Anti-A single-chain antigen-binding protein and Anti-
B single-chain antigen-binding protein, with different antigen specificities, each
individually binding either Antigen A or Antigen B. 
FIG. 3B depicts a hypothetical bispecific heterodivalent antigen-binding
protein formed from the single-chain antigen-binding proteins of Fig. 3A 
according to the Association model. 

FIG. 3C depicts the hypothetical heterodivalent antigen-binding protein 
of FIG. 3B binding bispecifically, i.e., binding the two different antigens, A 
and B. 

FIG. 4A depicts two identical single-chain antigen-binding protein 
molecules, each having a variable light chain region (VL), a variable heavy
chain region (VH), and a polypeptide linker joining the two regions. The
single-chain antigen-binding protein molecules are shown binding identical 
antigen molecules in their antigen-binding sites. 

FIG. 4B depicts a hypothetical homodivalent protein formed by the 
rearrangement of the VL and VH regions shown in FIG. 4A (the
Rearrangement model). Also shown is bound antigen. 

FIG. 5A depicts two single-chain protein molecules, the first having an 
anti-B VL and an anti-A VH, and the second having an anti-A VL and an anti-B
VH. The figure shows the non-complementary nature of the VL and VH
regions in each single-chain protein molecule. 

FIG. 5B shows a hypothetical bispecific heterodivalent antigen-binding
protein formed by rearrangement of the two single-chain proteins of Figure
5A.

FIG. 5C depicts the hypothetical heterodivalent antigen-binding protein
of FIG. 5B with different antigens A and B occupying their respective antigen-
binding sites.

FIG. 6A is a schematic depiction of a hypothetical trivalent antigen-
binding protein according to the Rearrangement model.

FIG. 6B is a schematic depiction of a hypothetical tetravalent antigen-
binding protein according to the Rearrangement model.

FIG. 7 is a chromatogram depicting the separation of CC49/212
antigen-binding protein monomer from dimer on a cation exchange high
performance liquid chromatographic column. The column is a PolyCAT A
aspartic acid column (Poly LC, Columbia, MD). Monomer is shown as Peak
1, eluting at 27.32 min., and dimer is shown as Peak 2, eluting at 55.52 min.

FIG. 8 is a chromatogram of the purified monomer from Fig. 7.
Monomer elutes at 21.94 min., preceded by dimer (20.135 min.) and trimer
(18.640 min.). Gel filtration column, Protein-Pak 300SW (Waters Associates,
Milford, MA).

FIG. 9 is a similar chromatogram of purified dimer (20.14 min.) from 
Fig. 7, run on the gel filtration HPLC column of Fig. 8. 

FIG. 10A is an amino acid (SEQ ID NO. 11) and nucleotide (SEQ ID
NO. 10) sequence of the single-chain protein comprising the 4-4-20 VL region
connected through the 212 linker polypeptide to the CC49 VH region.

FIG. 10B is an amino acid (SEQ ID NO. 13) and nucleotide (SEQ ID
NO. 12) sequence of the single-chain protein comprising the CC49 VL region
connected through the 212 linker polypeptide to the 4-4-20 VH region.

FIG. 11 is a chromatogram depicting the separation of the monomer 
(27.83 min.) and dimer (50.47 min.) forms of the CC49/212 antigen-binding 
protein by cation exchange, on a PolyCAT A cation exchange column (Poly 
LC, Columbia, MD). 

Fig. 12 shows the separation of monomer (17.65 min.), dimer (15.79 
min.), trimer (14.19 min.), and higher oligomers (shoulder at about 13.09 
min.) of the B6.2/212 antigen-binding protein. This separation depicts the 
results of a 24-hour treatment of a 1.0 mg/ml B6.2/212 single-chain antigen- 
binding protein sample. A TSK G2000SW gel filtration HPLC column was 
used, Toyo Soda, Tokyo, Japan. 

Fig. 13 shows the results of a 24-hour treatment of a 4.0 mg/ml 
CC49/212 antigen-binding protein sample, generating monomer, dimer, and 
trimer at 16.91, 14.9, and 13.42 min., respectively. The same TSK gel
filtration column was used as in Fig. 12. 

Fig. 14 shows a schematic view of the four-chain structure of a human 
IgG molecule. 
Fig. 15A is an amino acid (SEQ ID NO. 15) and nucleotide (SEQ ID
NO. 14) sequence of the 4-4-20/212 single-chain antigen-binding protein with
a single cysteine hinge. 

Fig. 15B is an amino acid (SEQ ID NO. 17) and nucleotide (SEQ ID
NO. 16) sequence of the 4-4-20/212 single-chain antigen-binding protein with
the two-cysteine hinge.

Fig. 16 shows the amino acid (SEQ ID NO. 19) and nucleotide (SEQ
ID NO. 18) sequence of a divalent CC49/212 single-chain antigen-binding
protein.

Fig. 17 shows the expression of the divalent CC49/212 single-chain
antigen-binding protein of Fig. 16 at 42°C, on an SDS-PAGE gel containing
total E. coli protein. Lane 1 contains the molecular weight standards. Lane
2 is the uninduced E. coli production strain grown at 30°C. Lane 3 is divalent
CC49/212 single-chain antigen-binding protein induced by growth at 42°C.
The arrow shows the band of expressed divalent CC49/212 single-chain
antigen-binding protein.

Fig. 18 is a graphical representation of four competition
radioimmunoassays (RIA) in which unlabeled CC49 IgG (open circles),
CC49/212 single-chain antigen-binding protein (closed circles), CC49/212
divalent antigen-binding protein (closed squares), and anti-fluorescein 4-4-
20/212 single-chain antigen-binding protein (open squares) competed against
a CC49 IgG radiolabeled with *^I for binding to the TAG-72 antigen on a
human breast carcinoma extract.

Figure 19A is an amino acid (SEQ ID NO. 21) and nucleotide (SEQ
ID NO. 20) sequence of the single-chain polypeptide comprising the 4-4-20 VL
region connected through the 217 linker polypeptide to the CC49 VH region.

Figure 19B is an amino acid (SEQ ID NO. 23) and nucleotide (SEQ
ID NO. 22) sequence of the single-chain polypeptide comprising the CC49 VL
region connected through the 217 linker polypeptide to the 4-4-20 VH region.

Figure 20 is a chromatogram depicting the purification of CC49/4-4-20
heterodimer Fv on a cation exchange high performance liquid chromatographic
column. The column is a PolyCAT A aspartic acid column (Poly LC,
Columbia, MD). The heterodimer Fv is shown as peak 5, eluting at 30.10
min.

Figure 21 is a coomassie-blue stained 4-20% SDS-PAGE gel showing
the proteins separated in Figure 20. Lane 1 contains the molecular weight
standards. Lane 3 contains the starting material before separation. Lanes 4-8
contain fractions 2, 3, 5, 6 and 7 respectively. Lane 9 contains purified
CC49/212.

Figure 22A is a chromatogram used to determine the molecular size of 
fraction 2 from Figure 20. A TSK G3000SW gel filtration HPLC column was
used (Toyo Soda, Tokyo, Japan). 

Figure 22B is a chromatogram used to determine the molecular size of 
fraction 5 from Figure 20. A TSK G3000SW gel filtration HPLC column was 
used (Toyo Soda, Tokyo, Japan). 

Figure 22C is a chromatogram used to determine the molecular size of 
fraction 6 from Figure 20. A TSK G3000SW gel filtration HPLC column was
used (Toyo Soda, Tokyo, Japan). 

Figure 23 shows a Scatchard analysis of the fluorescein binding affinity
of the CC49 4-4-20 heterodimer Fv (fraction 5 in Figure 20).

Figure 24 is a graphical representation of three competition enzyme-
linked immunosorbent assays (ELISA) in which unlabeled CC49 4-4-20 Fv
(closed squares), CC49/212 single-chain Fv (open squares), and MOPC-21 IgG
(+) competed against a biotin-labeled CC49 IgG for binding to the TAG-72
antigen on a human breast carcinoma extract. MOPC-21 is a control antibody
that does not bind to TAG-72 antigen.

Figure 25 shows a coomassie-blue stained non-reducing 4-20% SDS- 
PAGE gel. Lanes 1 and 9 contain the molecular weight standards. Lane 3 
contains the 4-4-20/212 CPPC single-chain antigen-binding protein after 
purification. Lanes 4, 5 and 6 contain the 4-4-20/212 CPPC single-chain
antigen-binding protein after treatment with DTT and air oxidation. Lane 7 
contains 4-4-20/212 single-chain antigen-binding protein. 

Figure 26 shows a coomassie-blue stained reducing 4-20% SDS-PAGE 
gel (samples were treated with β-mercaptoethanol prior to being loaded on the
gel). Lanes 1 and 8 contain the molecular weight standards. Lane 3 contains
the 4-4-20/212 CPPC single-chain antigen-binding protein after treatment with
bis-maleimidehexane. Lane 5 contains peak 1 of bis-maleimidehexane treated
4-4-20/212 CPPC single-chain antigen-binding protein. Lane 6 contains peak
3 of bis-maleimidehexane treated 4-4-20/212 CPPC single-chain antigen-
binding protein.

Detailed Description of the Preferred Embodiments 

This invention relates to the discovery that multivalent forms of single- 
chain antigen-binding proteins have significant utility beyond that of the 
monovalent single-chain antigen-binding proteins. A multivalent antigen- 
binding protein has more than one antigen-binding site. For the purposes of 
this application, "valent" refers to the numerosity of antigen binding sites. 
Thus, a bivalent protein refers to a protein with two binding sites. Enhanced 
binding activity, bi- and multi-specific binding, and other novel uses of
multivalent antigen-binding proteins have been demonstrated or are envisioned
here. Accordingly, the invention is directed to multivalent forms of single-
chain antigen-binding proteins, compositions of multivalent and single-chain
antigen-binding proteins, methods of making and purifying multivalent forms 
of single-chain antigen-binding proteins, and new and improved uses for 
multivalent forms of single-chain antigen-binding proteins. The invention 
provides a multivalent antigen-binding protein comprising two or more single- 
chain protein molecules, each single-chain molecule comprising a first 
polypeptide comprising the binding portion of the variable region of an 
antibody heavy or light chain; a second polypeptide comprising the binding 
portion of the variable region of an antibody heavy or light chain; and a 
peptide linker linking the first and second polypeptides into a single-chain 
protein. 

The term "multivalent" means any assemblage, covalently or non- 
covalently joined, of two or more single-chain proteins, the assemblage having 
more than one antigen-binding site. The single-chain proteins composing the 
assemblage may have antigen-binding activity, or they may lack antigen-
binding activity individually but be capable of assembly into active multivalent
antigen-binding proteins. The term "multivalent" encompasses bivalent,
trivalent, tetravalent, etc. It is envisioned that multivalent forms above
bivalent may be useful for certain applications.

A preferred form of the multivalent antigen-binding protein comprises
bivalent proteins, including heterobivalent and homobivalent forms. The term
"bivalent" means an assemblage of single-chain proteins associated with each
other to form two antigen-binding sites. The term "heterobivalent" indicates
multivalent antigen-binding proteins that are bispecific molecules capable of
binding to two different antigenic determinants. Therefore, heterobivalent
proteins have two antigen-binding sites that have different binding specificities.
The term "homobivalent" indicates that the two binding sites are for the same
antigenic determinant.
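
The terminology defined above can be illustrated with a minimal sketch. The
names BindingSite, VALENCY_NAMES and describe are hypothetical and are used
for illustration only; the extension of the "homo-" and "hetero-" prefixes to
forms above bivalent is likewise an assumption of the sketch, not a definition
taken from this specification.

```python
from dataclasses import dataclass

# Hypothetical lookup table; only the listed terms appear in the text.
VALENCY_NAMES = {1: "monovalent", 2: "bivalent", 3: "trivalent", 4: "tetravalent"}


@dataclass(frozen=True)
class BindingSite:
    # Label of the antigenic determinant the site binds, e.g. "A" or "B".
    specificity: str


def describe(sites):
    """Name an assemblage from the number and specificities of its sites."""
    n = len(sites)
    if n == 0:
        return "no antigen-binding site"
    name = VALENCY_NAMES.get(n, f"{n}-valent")
    if n == 1:
        return name
    # One distinct specificity -> "homo"; more than one -> "hetero".
    distinct = {site.specificity for site in sites}
    prefix = "homo" if len(distinct) == 1 else "hetero"
    return prefix + name


print(describe([BindingSite("A"), BindingSite("A")]))
print(describe([BindingSite("A"), BindingSite("B")]))
```

Under these assumptions, two sites for the same determinant are reported as
homobivalent and two sites for different determinants as heterobivalent.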

The terms "single-chain molecule" or "single-chain protein" are used 
interchangeably here. They are structurally defined as comprising the binding 
portion of a first polypeptide from the variable region of an antibody,
associated with the binding portion of a second polypeptide from the variable 
region of an antibody, the two polypeptides being joined by a peptide linker 
linking the first and second polypeptides into a single polypeptide chain. The
single polypeptide chain thus comprises a pair of variable regions connected
by a polypeptide linker. The regions may associate to form a functional 
antigen-binding site, as in the case wherein the regions comprise a light-chain 
and a heavy-chain variable region pair with appropriately paired 
complementarity determining regions (CDRs). In this case, the single-chain 
protein is referred to as a "single-chain antigen-binding protein" or "single-
chain antigen-binding molecule."

Alternatively, the variable regions may have unnaturally paired CDRs 
or may both be derived from the same kind of antibody chain, either heavy or 
light, in which case the resulting single-chain molecule may not display a 
functional antigen-binding site. The single-chain antigen-binding protein 
molecule is more fully described in U.S. Patent No. 4,946,778 (Ladner et al.),
and incorporated herein by reference.

Without being bound by any particular theory, the inventors speculate
on several models which can equally explain the phenomenon of multivalence.
The inventors' models are presented herein for the purpose of illustration only,
and are not to be construed as limitations upon the scope of the invention.
The invention is useful and operable regardless of the precise mechanism of
multivalence.

Figure 1 depicts the first hypothetical model for the creation of a
multivalent protein, the "Association" model. Fig. 1A shows two monovalent
single-chain antigen-binding proteins, each composed of a VL, a VH, and a
linker polypeptide covalently bridging the two. Each monovalent single-chain
antigen-binding protein is depicted having an identical antigen-binding site
containing antigen. Figure 1B shows the simple association of the two single-
chain antigen-binding proteins to create the bivalent form of the multivalent
protein. It is hypothesized that simple hydrophobic forces between the
monovalent proteins are responsible for their association in this manner. The
origin of the multivalent proteins may be traceable to their concentration
dependence. The monovalent units retain their original association between
the VH and VL regions. Figure 1C shows the newly-formed homobivalent
protein binding two identical antigen molecules simultaneously. Homobivalent
antigen-binding proteins are necessarily monospecific for antigen.

Homovalent proteins are depicted in Figs. 2A through 2C formed
according to the Association model. Fig. 2A depicts a homobivalent protein,
Fig. 2B a trivalent protein, and Fig. 2C a tetravalent protein. Of course, the
limitations of two-dimensional images of three-dimensional objects must be
taken into account. Thus, the actual spatial arrangement of multivalent
proteins can be expected to vary somewhat from these figures.

A heterobivalent antigen-binding protein has two different binding sites,
the sites having different binding specificities. Figures 3A through C depict
the Association model pathway to the creation of a heterobivalent protein.
Figure 3A shows two monovalent single-chain antigen-binding proteins, Anti-
A single-chain antigen-binding protein and Anti-B single-chain antigen-binding
protein, with antigen types A and B occupying the respective binding sites.
Figure 3B depicts the heterobivalent protein formed by the simple association
of the original monovalent proteins. Figure 3C shows the heterobivalent
protein with antigens A and B bound in the antigen-binding sites. Figure
3C therefore shows the heterobivalent protein binding in a bispecific manner.

An alternative model for the formation of multivalent antigen-binding
proteins is shown in Figures 4 through 6. This "Rearrangement" model
hypothesizes the dissociation of the variable region interface by contact with
dissociating agents such as guanidine hydrochloride, urea, or alcohols such as
ethanol, either alone or in combination. Combinations and relevant
concentration ranges of dissociating agents are recited in the discussion
concerning dissociating agents, and in Example 2. Subsequent re-association
of dissociated regions allows variable region recombination differing from the
starting single-chain proteins, as depicted in Fig. 4B. The homobivalent
antigen-binding protein of Figure 4B is formed from the parent single-chain
antigen-binding proteins shown in Figure 4A, the recombined bivalent protein
having VL and VH from the parent monovalent single-chain proteins. The
homobivalent protein of Figure 4B is a fully functional monospecific bivalent
protein, shown actively binding two antigen molecules.

Figures 5A-5C show the formation of heterobivalent antigen-binding
proteins via the Rearrangement model. Figure 5A shows a pair of
single-chain proteins, each having a VL with complementarity determining
regions (CDRs) that do not match those of the associated VH. These single-chain
proteins have reduced or no ability to bind antigen because of the mixed
nature of their antigen-binding sites, and thus are made specifically to be
assembled into multivalent proteins through this route. Figure 5B shows the
heterobivalent antigen-binding protein formed whereby the VH and VL regions
of the parent proteins are shared between the separate halves of the
heterobivalent protein. Figure 5C shows the binding of two different antigen
molecules to the resultant functional bispecific heterobivalent protein. The
Rearrangement model also explains the generation of multivalent proteins of
a higher order than bivalent, as it can be appreciated that more than a pair of
single-chain proteins can be reassembled in this manner. These are depicted
in Figures 6A and 6B.

One of the major utilities of the multivalent antigen-binding protein is 
in the heterobivalent form, in which one specificity is for one type of hapten 
or antigen, and the second specificity is for a second type of hapten or 
antigen. A multivalent molecule having two distinct binding specificities has 
many potential uses. For instance, one antigen binding site may be specific 
for a cell-surface epitope of a target cell, such as a tumor cell or other 
undesirable cell. The other antigen-binding site may be specific for a
cell-surface epitope of an effector cell, such as the CD3 protein of a cytotoxic
T-cell. In this way, the heterobivalent antigen-binding protein may guide a
cytotoxic cell to a particular class of cells that are to be preferentially 
attacked. 

Other uses of heterobivalent antigen-binding proteins are the specific 
targeting and destruction of blood clots by a bispecific molecule with 
specificity for tissue plasminogen activator (tPA) and fibrin; the specific 
targeting of pro-drug activating enzymes to tumor cells by a bispecific
molecule with specificity for tumor cells and enzyme; and specific targeting 
of cytotoxic proteins to tumor cells by a bispecific molecule with specificity 
for tumor cells and a cytotoxic protein. This list is illustrative only, and any 
use for which a multivalent specificity is appropriate comes within the scope 
of this invention. 

The invention also extends to uses for the multivalent antigen-binding
proteins in purification and biosensors. Affinity purification is made possible 
by affixing the multivalent antigen-binding protein to a support, with the 
antigen-binding sites exposed to and in contact with the ligand molecule to be 
separated, and thus purified. Biosensors generate a detectable signal upon 
binding of a specific antigen to an antigen-binding molecule, with subsequent 
processing of the signal. Multivalent antigen-binding proteins, when used as 
the antigen-binding molecule in biosensors, may change conformation upon 
binding, thus generating a signal that may be detected.

Essentially all of the uses for which monoclonal or polyclonal
antibodies, or fragments thereof, have been envisioned by the prior art, can 
be addressed by the multivalent proteins of the present invention. These uses 
include detectably-labelled forms of the multivalent protein. Types of labels 
are well-known to those of ordinary skill in the art. They include 
radiolabelling, chemiluminescent labeling, fluorochromic labelling, and 
chromophoric labeling. Other uses include imaging the internal structure of 
an animal (including a human) by administering an effective amount of a 
labelled form of the multivalent protein and measuring detectable radiation 
associated with the animal. They also include improved immunoassays, 
including sandwich immunoassay, competitive immunoassay, and other 
immunoassays wherein the labelled antibody can be replaced by the 
multivalent antigen-binding protein of this invention. 

A first preferred method of producing multivalent antigen-binding 
proteins involves separating the multivalent proteins from a production
composition that comprises both multivalent and single-chain proteins, as
represented in Example 1. The method comprises producing a composition
of multivalent and single-chain proteins, separating the multivalent proteins
from the single-chain proteins, and recovering the multivalent proteins. 

A second preferred method of producing multivalent antigen-binding 
proteins comprises the steps of producing single-chain protein molecules, 
dissociating said single-chain molecules, reassociating the single-chain 
molecules such that a significant fraction of the resulting composition includes 
multivalent forms of the single-chain antigen-binding proteins, separating 
multivalent antigen-binding proteins from single-chain molecules, and 
recovering the multivalent proteins. This process is illustrated with more 
detail in Example 2. For the purposes of this method, the term "producing a 
composition comprising single-chain molecules" may indicate the actual 
production of these molecules. The term may also include procuring them 
from whatever commercial or institutional source makes them available. Use 
of the term "producing single-chain proteins" means production of single-chain 
proteins by any process, but preferably according to the process set forth in
U.S. Patent No. 4,946,778 (Ladner et al.). Briefly, that patent pertains to a
single polypeptide chain antigen-binding molecule which has binding
specificity and affinity substantially similar to the binding specificity and
affinity of the aggregate light and heavy chain variable regions of an antibody,
to genetic sequences coding therefor, and to recombinant DNA methods of
producing such molecules, and uses for such molecules. The single-chain
protein produced by the Ladner et al. methodology comprises two regions
linked by a linker polypeptide. The two regions are termed the VH and VL
regions, each region comprising one half of a functional antigen-binding site.

The term "dissociating said single-chain molecules" means to cause the 
physical separation of the two variable regions of the single-chain protein 
without causing denaturation of the variable regions. 

"Dissociating agents" are defined herein to include all agents capable 
of dissociating the variable regions, as defined above. In the context of this 
invention, the term includes the well-known agents alcohol (including ethanol),
guanidine hydrochloride (GuHCl), and urea. Others will be apparent to those
of ordinary skill in the art, including detergents and similar agents capable of
interrupting the interactions that maintain protein conformation. In the
preferred embodiment, a combination of GuHCl and ethanol (EtOH) is used
as the dissociating agent. A preferred range for ethanol and GuHCl is from
0 to 50% EtOH, vol/vol, and 0 to 2.0 moles per liter (M) GuHCl. A more
preferred range is from 10-30% EtOH and 0.5-1.0 M GuHCl, and a most
preferred range is 20% EtOH, 0.5 M GuHCl. A preferred dissociation buffer
contains 0.5 M guanidine hydrochloride, 20% ethanol, 0.05 M TRIS, and
0.01 M CaCl2, pH 8.0.
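
As a rough worked example, the preferred dissociation buffer recited above can
be converted into reagent amounts for a chosen final volume. The sketch below
is illustrative only. The molar masses are standard handbook values, and the
reagent forms (Tris free base, anhydrous calcium chloride, undiluted ethanol)
are assumptions, since the text does not specify them. Adjustment to pH 8.0 is
not modeled.

```python
# Molar masses in g/mol (handbook values; reagent forms are assumed, not
# stated in the text).
MOLAR_MASS = {
    "guanidine hydrochloride": 95.53,
    "Tris base": 121.14,
    "CaCl2 (anhydrous)": 110.98,
}

# Target concentrations in mol/L, as recited for the preferred buffer.
TARGET_MOLARITY = {
    "guanidine hydrochloride": 0.5,
    "Tris base": 0.05,
    "CaCl2 (anhydrous)": 0.01,
}

ETHANOL_FRACTION = 0.20  # 20% ethanol, vol/vol


def dissociation_buffer_recipe(final_volume_l):
    """Return grams of each solid and liters of ethanol for the given volume."""
    recipe = {
        name: TARGET_MOLARITY[name] * MOLAR_MASS[name] * final_volume_l
        for name in TARGET_MOLARITY
    }
    recipe["ethanol (L)"] = ETHANOL_FRACTION * final_volume_l
    return recipe


if __name__ == "__main__":
    for reagent, amount in dissociation_buffer_recipe(1.0).items():
        print(f"{reagent}: {amount:.3f}")
```

For one liter this works out to roughly 47.8 g of guanidine hydrochloride,
6.1 g of Tris base, 1.1 g of calcium chloride, and 0.2 liter of ethanol, under
the assumptions stated.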

Use of the term "re-associating said single-chain molecules" is meant
to describe the reassociation of the variable regions by contacting them with
a buffer solution that allows reassociation. Such a buffer is preferably used
in the present invention and is characterized as being composed of 0.04 M
MOPS, 0.10 M calcium acetate, pH 7.5. Other buffers allowing the
reassociation of the VL and VH regions are well within the expertise of one of
ordinary skill in the art.

The separation of the multivalent protein from the single-chain
molecules occurs by use of standard techniques known in the art, particularly
including cation exchange or gel filtration chromatography.

Cation exchange chromatography is the general liquid chromatographic
technique of ion-exchange chromatography utilizing anionic columns well known
to those of ordinary skill in the art. In this invention, the cations exchanged
are the single-chain and multivalent protein molecules. Since multivalent
proteins will have some multiple of the net charge of the single-chain
molecule, the multivalent proteins are retained more strongly and are thus
separated from the single-chain molecules. The preferred cation exchanger
of the present invention is a polyaspartic acid column, as shown in Figure 7.
Figure 7 depicts the separation of single-chain protein (Peak 1, 27.32 min.)
from bivalent protein (Peak 2, 55.54 min.). Those of ordinary skill in the art
will realize that the invention is not limited to any particular type of
chromatography column, so long as it is capable of separating the two forms
of protein molecules.

Gel filtration chromatography is the use of a gel-like material to 
separate proteins on the basis of their molecular weight. A "gel" is a matrix 
of water and a polymer, such as agarose or polymerized acrylamide. The 
present invention encompasses the use of gel filtration HPLC (high
performance liquid chromatography), as will be appreciated by one of ordinary
skill in the art. Figure 8 is a chromatogram depicting the use of a Waters
Associates' Protein-Pak 300 SW gel filtration column to separate monovalent
single-chain protein from multivalent protein, including the monomer (21.940
min.), bivalent protein (20.135 min.), and trivalent protein (18.640 min.).

Recovering the multivalent antigen-binding proteins is accomplished by 
standard collection procedures well known in the chemical and biochemical 
arts. In the context of the present invention recovering the multivalent protein
preferably comprises collection of eluate fractions containing the peak of
interest from either the cation exchange column, or the gel filtration HPLC
column. Manual and automated fraction collection are well-known to one of
ordinary skill in the art. Subsequent processing may involve lyophilization of
the eluate to produce a stable solid, or further purification.

A third preferred method of producing multivalent antigen-binding
proteins is to start with purified single-chain proteins at a lower
concentration, and then increase the concentration until some significant
fraction of multivalent proteins is formed. The multivalent proteins are then
separated and recovered. The concentrations conducive to formation of
multivalent proteins in this manner are from about 0.5 milligram per milliliter
(mg/ml) to the concentration at which precipitates begin to form.

The use of the term "substantially free" when used to describe a
composition of multivalent and single-chain antigen-binding protein molecules
means the lack of a significant peak corresponding to the single-chain
molecule, when the composition is analyzed by cation exchange
chromatography, as disclosed in Example 1 or by gel filtration
chromatography as disclosed in Example 2.

By use of the term "aqueous composition" is meant any composition
of single-chain molecules and multivalent proteins including a portion of
water. In the same context, the phrase "an excess of multivalent
antigen-binding protein over single-chain molecules" indicates that the
composition comprises more than 50% of multivalent antigen-binding protein.

The use of the term "cross-linking" refers to chemical means by which
one can produce multivalent antigen-binding proteins from monovalent
single-chain protein molecules. For example, the incorporation of a
cross-linkable sulfhydryl chemical group as a cysteine residue in the
single-chain proteins allows cross-linking by mild reduction of the sulfhydryl
group. Both monospecific and multispecific multivalent proteins can be
produced from single-chain proteins by cross-linking the free cysteine groups
from two or more single-chain proteins, causing a covalent chemical linkage to
form between the individual proteins. Free cysteines have been engineered into
the C-terminal portion of the 4-4-20/212 single-chain antigen-binding protein,
as discussed in Example 5 and Example 8. These free cysteines may then be
cross-linked to form multivalent antigen-binding proteins.

The invention also comprises single-chain proteins, comprising: (a) a
first polypeptide comprising the binding portion of the variable region of an
antibody light chain; (b) a second polypeptide comprising the binding portion
of the variable region of an antibody light chain; and (c) a peptide linker
linking said first and second polypeptides (a) and (b) into said single-chain
protein. A similar single-chain protein comprising the heavy chain variable
regions is also a part of this invention. Genetic sequences encoding these
molecules are also included in the scope of this invention. Since these proteins
are comprised of two similar variable regions, they do not necessarily have
any antigen-binding capability.

The invention also includes a DNA sequence encoding a bispecific 
bivalent antigen-binding protein. Example 4 and Example 7 discuss in detail
the sequences that appear in Figs. 10A and 10B that allow one of ordinary
skill to construct a heterobivalent antigen-binding molecule. Figure 10A is an
amino acid and nucleotide sequence listing of the single-chain protein
comprising the 4-4-20 VL region connected through the 212 linker polypeptide
to the CC49 VH region. Figure 10B is a similar listing of the single-chain
protein comprising the CC49 VL region connected through the 212 linker
polypeptide to the 4-4-20 VH region. Subjecting a composition including these
single-chain molecules to dissociating and subsequent re-associating conditions 
results in the production of a bivalent protein with two different binding 
specificities. 

Synthesis of DNA sequences is well known in the art, and possible
through at least two routes. First, it is well-known that DNA sequences may
be synthesized de novo through the use of automated DNA synthesizers, once
the primary sequence information is known. Alternatively, it is possible to
obtain a DNA sequence coding for a multivalent single-chain antigen-binding
protein by removing the stop codons from the end of a gene encoding a
single-chain antigen-binding protein, and then inserting a linker and a gene
encoding a second single-chain antigen-binding protein. Example 6 demonstrates
the construction of a DNA sequence coding for a bivalent single-chain
antigen-binding protein. Other methods of genetically constructing multivalent
single-chain antigen-binding proteins come within the spirit and scope of the
present invention.

Having now generally described this invention the same will better be 
understood by reference to certain specific examples which are included for 
purposes of illustration and are not intended to limit it unless otherwise 
specified.

Example 1 

Production of Multivalent 
Antigen-Binding Proteins During Purification 

In the production of multivalent antigen-binding proteins, the same
recombinant E. coli production system that was used for prior single-chain
antigen-binding protein production was used. See Bird, et al., Science 242:423
(1988). This production system produced between 2 and 20% of the total E.
coli protein as antigen-binding protein. For protein recovery, the frozen cell
paste from three 10-liter fermentations (600-900 g) was thawed overnight at
4°C and gently resuspended at 4°C in 50 mM Tris-HCl, 1.0 mM EDTA, 100
mM KCl, 0.1 mM PMSF, pH 8.0 (lysis buffer), using 10 liters of lysis buffer
for every kilogram of wet cell paste. When thoroughly resuspended, the
chilled mixture was passed three times through a Manton-Gaulin cell
homogenizer to totally lyse the cells. Because the cell homogenizer raised the
temperature of the cell lysate to 25±5°C, the cell lysate was cooled to
5±2°C with a Lauda/Brinkman chilling coil after each pass. Complete lysis
was verified by visual inspection under a microscope.

The cell lysate was centrifuged at 24,300g for 30 min. at 6°C using a
Sorvall RC-5B centrifuge. The pellet containing the insoluble antigen-binding
protein was retained, and the supernatant was discarded. The pellet was
washed by gently scraping it from the centrifuge bottles and resuspending it
in 5 liters of lysis buffer/kg of wet cell paste. The resulting 3.0- to 4.5-liter
suspension was again centrifuged at 24,300g for 30 min at 6°C, and the
supernatant was discarded. This washing of the pellet removes soluble E. coli
proteins and can be repeated as many as five times. At any time during this
washing procedure the material can be stored as a frozen pellet at -20°C. A
substantial time saving in the washing steps can be accomplished by utilizing
a Pellicon tangential flow apparatus equipped with 0.22-µm microporous
filters, in place of centrifugation.
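
The buffer volumes quoted in the lysis and wash steps follow directly from the
per-kilogram ratios. The short sketch below, with hypothetical function and
constant names, reproduces that arithmetic for the 600-900 g range of cell
paste given above.

```python
LYSIS_L_PER_KG = 10.0  # liters of lysis buffer per kg of wet cell paste
WASH_L_PER_KG = 5.0    # liters of lysis buffer per kg for each pellet wash


def buffer_volumes(cell_paste_g):
    """Return lysis and wash buffer volumes (liters) for a mass of cell paste."""
    kg = cell_paste_g / 1000.0
    return {"lysis_l": LYSIS_L_PER_KG * kg, "wash_l": WASH_L_PER_KG * kg}


if __name__ == "__main__":
    for grams in (600, 900):
        print(grams, "g:", buffer_volumes(grams))
```

The wash volumes for 600 g and 900 g come to 3.0 and 4.5 liters, which agrees
with the 3.0- to 4.5-liter suspension described in the wash step.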

The washed pellet was solubilized at 4°C in freshly prepared 6 M
guanidine hydrochloride, 50 mM Tris-HCl, 10 mM CaCl2, 50 mM KCl, pH
8.0 (dissociating buffer), using 9 ml/g of pellet. If necessary, a few quick
pulses from a Heat Systems Ultrasonics tissue homogenizer can be used to
complete the solubilization. The resulting suspension was centrifuged at
24,300g for 45 min at 6°C and the pellet was discarded. The optical density
of the supernatant was determined at 280 nm and if the OD280 was above 30,
additional dissociating buffer was added to obtain an OD280 of approximately
25.
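
The OD280 adjustment can be sketched as a simple dilution calculation. The
function below is hypothetical and assumes that absorbance falls in proportion
to dilution and that the dissociating buffer itself contributes negligible
absorbance at 280 nm. Neither assumption is stated in the text.

```python
def buffer_to_add(volume_ml, od280, target_od=25.0, threshold_od=30.0):
    """Volume of dissociating buffer (ml) needed to reach the target OD280.

    No buffer is added unless the measured OD280 exceeds the threshold.
    """
    if od280 <= threshold_od:
        return 0.0
    # Dilution factor od280/target_od means final volume = volume * factor.
    return volume_ml * (od280 / target_od - 1.0)


if __name__ == "__main__":
    print(buffer_to_add(1000.0, 40.0))  # supernatant above the threshold
    print(buffer_to_add(1000.0, 28.0))  # below the threshold, nothing added
```

For example, one liter of supernatant at an OD280 of 40 would need about
600 ml of additional buffer to reach an OD280 of 25 under these assumptions.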

The supernatant was slowly diluted into cold (4-7°C) refolding buffer
(50 mM Tris-HCl, 10 mM CaCl2, 50 mM KCl, pH 8.0) until a 1:10 dilution
was reached (final volume 10-20 liters). Re-folding occurs over approximately
eighteen hours under these conditions. The best results are obtained when the
GuHCl extract is slowly added to the refolding buffer over a 2-h period, with
gentle mixing. The solution was left undisturbed for at least a 20-h period,
and 95% ethanol was added to this solution such that the final ethanol
concentration was approximately 20%. This solution was left undisturbed until
the flocculated material settled to the bottom, usually not less than sixty
minutes. The solution was filtered through a 0.2 µm Millipore Millipak 200.
This filtration step may be optionally preceded by a centrifugation step. The
filtrate was concentrated to 1 to 2 liters using an Amicon spiral cartridge with
a 10,000 MWCO cartridge, again at 4°C.
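
The refolding dilution and the ethanol addition can be illustrated with a
small calculation. This is a sketch under stated assumptions: the 1:10 dilution
is read as one part extract in ten parts total, liquid volumes are treated as
additive (ethanol-water mixtures actually contract slightly), and the function
names are hypothetical.

```python
def refolding_dilution(extract_l, dilution_factor=10.0, guhcl_m=6.0):
    """Return (final volume in liters, residual GuHCl molarity) after dilution."""
    final_volume_l = extract_l * dilution_factor
    return final_volume_l, guhcl_m / dilution_factor


def ethanol_stock_volume(solution_l, final_fraction=0.20, stock_fraction=0.95):
    """Liters of ethanol stock to add so the mixture reaches final_fraction.

    Solves stock_fraction * v / (solution_l + v) = final_fraction for v.
    """
    return solution_l * final_fraction / (stock_fraction - final_fraction)


if __name__ == "__main__":
    volume_l, guhcl = refolding_dilution(1.5)
    print(volume_l, "L refolding mixture at", guhcl, "M GuHCl")
    print(ethanol_stock_volume(volume_l), "L of 95% ethanol")
```

Under these assumptions, 1.5 liters of extract gives 15 liters of refolding
mixture at about 0.6 M GuHCl, and about 4 liters of 95% ethanol brings it to
approximately 20% ethanol.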

The concentrated crude antigen-binding protein sample was dialyzed
against Buffer A (60 mM MOPS, 0.5 mM Ca acetate, pH 6.0-6.4) until the
conductivity was lowered to that of Buffer A. The sample was then loaded on
a 21.5 x 250-mm polyaspartic acid PolyCAT A column, manufactured by
PolyLC of Columbia, Maryland. If more than 60 mg of protein is loaded on this
column, the resolution begins to deteriorate; thus, the concentrated crude
sample often must be divided into several PolyCAT A runs. Most
antigen-binding proteins have an extinction coefficient of about
2.0 ml mg^-1 cm^-1 at 280 nm and this can be used to determine protein
concentration. The antigen-binding protein sample was eluted from the
PolyCAT A column with a 50-min linear gradient from Buffer A to Buffer B (see
Table 1). Most of the single-chain proteins elute between 20 and 26 minutes
when this gradient is used. This corresponds to an eluting solvent composition
of approximately 70% Buffer A and 30% Buffer B. Most of the bivalent
antigen-binding proteins elute later than 45 minutes, which corresponds to
over 90% Buffer B.
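
The extinction coefficient and the 60 mg loading limit given above lend
themselves to a short calculation. The sketch below uses hypothetical function
names and assumes a 1 cm path length unless stated otherwise.

```python
import math

EXTINCTION_ML_PER_MG_CM = 2.0  # approximate value given for A280
MAX_LOAD_MG = 60.0             # loading above this degrades resolution


def protein_mg_per_ml(a280, path_cm=1.0, dilution=1.0):
    """Protein concentration (mg/ml) from absorbance at 280 nm."""
    return a280 * dilution / (EXTINCTION_ML_PER_MG_CM * path_cm)


def runs_needed(total_protein_mg):
    """Minimum number of column runs that keeps each load at or below 60 mg."""
    return math.ceil(total_protein_mg / MAX_LOAD_MG)


if __name__ == "__main__":
    conc = protein_mg_per_ml(1.0)
    print(conc, "mg/ml")
    print(runs_needed(150.0), "runs for 150 mg")
```

A sample reading 1.0 absorbance unit in a 1 cm cell corresponds to about
0.5 mg/ml, and 150 mg of protein would be split over three runs.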

Figure 7 is a chromatogram depicting the separation of single-chain 
protein from bivalent CC49/212 protein, using the cation-exchange method just
described. Peak 1, 27.32 minutes, represents the monomeric single-chain 
fraction. Peak 2, 55.52 minutes, represents the bivalent protein fraction. 

Figure 8 is a chromatogram of the purified monomeric single-chain
antigen-binding protein CC49/212 (Fraction 7 from Fig. 7) run on a Waters
Protein-Pak 300SW gel filtration column. Monomer, with minor contaminants
of dimer and trimer, is shown. Figure 9 is a chromatogram of the purified
bivalent antigen-binding protein CC49/212 (Fraction 15 from Fig. 7) run on
the same Waters Protein-Pak 300SW gel filtration column as used in Fig. 8.

cation exchange and gel filtration chromatography, can be used to separate the
single-chain protein monomer from the multivalent antigen-binding proteins.
In the first method, monomeric and multivalent antigen-binding proteins were
separated by using cation exchange HPLC chromatography, using a
polyaspartate column (PolyCAT A). This was a similar procedure to that used
in the final purification of the antigen-binding proteins as described in
Example 1. The load buffer was 0.06 M MOPS, 0.001 M calcium acetate
pH 6.4. In the second method, the monomeric and multivalent
antigen-binding proteins were separated by gel filtration HPLC chromatography
using as a load buffer 0.04 M MOPS, 0.10 M calcium acetate pH 7.5. Gel
filtration chromatography separates proteins based on their molecular size.

Once the antigen-binding protein sample was loaded on the cation
exchange HPLC column, a linear gradient was run between the load buffer
(0.04 to 0.06 M MOPS, 0.000 to 0.001 M calcium acetate, 0 to 10% glycerol
pH 6.0-6.4) and a second buffer (0.04 to 0.06 M MOPS, 0.01 to 0.02 M
calcium acetate, 0 to 10% glycerol pH 7.5). It was important to have
extensively dialyzed the antigen-binding protein sample before loading it on
the column. Normally, the conductivity of the sample is monitored against the
dialysis buffer. Dialysis is continued until the conductivity drops below 600
µS. Figure 11 shows the separation of the monomeric (27.83 min.) and
bivalent (50.47 min.) forms of the CC49/212 antigen-binding protein by cation
exchange. The chromatographic conditions for this separation were as
follows: PolyCAT A column, 200 x 4.6 mm, operated at 0.62 ml/min.; load
buffer and second buffer as in Example 1; gradient program from 100 percent
load buffer A to 0 percent load buffer A over 48 mins; sample was CC49/212,
1.66 mg/ml; injection volume 0.2 ml. Fractions were collected from the two
peaks from a similar chromatogram and identified as monomeric and bivalent
proteins using gel filtration HPLC chromatography as described below.

Gel filtration HPLC chromatography (TSK G2000SW column from
Toyo Soda, Tokyo, Japan) was used to identify and separate monomeric
single-chain and multivalent antigen-binding proteins. This procedure has
been described by Fukano, et al., J. Chromatography 166:47 (1978).
Multimerization (creation of multivalent protein from monomeric single-chain
protein) was by treatment with 0.5 M GuHCl and 20% EtOH for the times
indicated in Table 2A followed by dialysis into the chromatography buffer.
Figure 12 shows the separation of monomeric (17.65 min.), bivalent (15.79
min.), trivalent (14.19 min.), and higher oligomers (shoulder at about 13.09
min.) of the B6.2/212 antigen-binding protein. The B6.2/212 single-chain
antigen-binding protein is described in Colcher, D., et al., J. Nat. Cancer
Inst. 82:1191-1197 (1990). This separation depicts the results of a 24-hour
multimerization treatment of a 1.0 mg/ml B6.2/212 antigen-binding protein
sample. The HPLC buffer used was 0.04 M MOPS, 0.10 M calcium acetate,
0.04% sodium azide, pH 7.5.

Figure 13 shows the results of a 24-hour treatment of a 4.0 mg/ml
CC49/212 antigen-binding protein sample, generating monomeric, bivalent, and
trivalent proteins at 16.91, 14.9, and 13.42 min., respectively. The HPLC
buffer was 40 mM MOPS, 100 mM calcium acetate, pH 7.35.
Multimerization treatment was for the times indicated in Table 2A.

The results of Example 2A are shown in Table 2A. Table 2A shows
the percentage of bivalent and other multivalent forms before and after
treatment with 20% ethanol and 0.5 M GuHCl. Unless otherwise indicated,
percentages were determined using an automatic data integration software
package.

Table 2A

Summary of the generation of bivalent and higher
multivalent forms of B6.2/212 and CC49/212
proteins using guanidine hydrochloride and ethanol

Protein     Time      Concentration   ------------------ % ------------------
            (hours)   (mg/ml)         monomer   dimer    trimer   multimers

CC49/212     0        0.25             86.7     11.6      1.7      0.0
             0        1.0*             84.0     10.6      5.5      0.0
             0        4.0              70.0     17.1     12.9*     0.0
             2        0.25*            62.9     33.2      4.2      0.0
             2        1.0              24.2     70.6      5.1      0.0
             2        4.0               9.3     81.3      9.5      0.0
            26        0.25             16.0     77.6      6.4      0.0
            26        1.0               9.2     82.8      7.9      0.0
            26        4.0               3.7     78.2     18.1      0.0

B6.2/212     0        0.25            100.0      0.0      0.0      0.0
             0        1.0             100.0      0.0      0.0      0.0
             0        4.0             100.0      0.0      0.0      0.0
             2        0.25*            98.1      1.9      0.0      0.0
             2        1.0             100.0      0.0      0.0      0.0
             2        4.0              90.0      5.5      1.0      0.0
            24        0.25             45.6     37.5     10.1      6.7
            24        1.0              50.8     21.4     12.3     15.0
            24        4.0               5.9     37.2     25.7     29.9

* Based on cut out peaks that were weighted.

* Average of two experiments.
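
As an illustration of how the figures in Table 2A relate to the "excess of
multivalent antigen-binding protein" criterion defined earlier (more than 50%
multivalent protein), the sketch below sums the dimer, trimer, and multimer
percentages for each row. It uses the nine rows read here as the CC49/212
block, which is an interpretation of the flattened table, and the function
names are hypothetical.

```python
# Each row: (hours, mg/ml, % monomer, % dimer, % trimer, % higher multimers).
# Rows are taken from the block of Table 2A read as CC49/212 (an assumption
# about the table layout, not a statement from the text).
CC49_212 = [
    (0, 0.25, 86.7, 11.6, 1.7, 0.0),
    (0, 1.0, 84.0, 10.6, 5.5, 0.0),
    (0, 4.0, 70.0, 17.1, 12.9, 0.0),
    (2, 0.25, 62.9, 33.2, 4.2, 0.0),
    (2, 1.0, 24.2, 70.6, 5.1, 0.0),
    (2, 4.0, 9.3, 81.3, 9.5, 0.0),
    (26, 0.25, 16.0, 77.6, 6.4, 0.0),
    (26, 1.0, 9.2, 82.8, 7.9, 0.0),
    (26, 4.0, 3.7, 78.2, 18.1, 0.0),
]


def multivalent_percent(row):
    """Sum of dimer, trimer, and higher multimer percentages for one row."""
    return sum(row[3:])


def has_multivalent_excess(row):
    """True when multivalent forms make up more than 50% of the protein."""
    return multivalent_percent(row) > 50.0


if __name__ == "__main__":
    for row in CC49_212:
        hours, conc = row[0], row[1]
        print(hours, conc, round(multivalent_percent(row), 1),
              has_multivalent_excess(row))
```

On this reading, every 2-hour and 26-hour sample except the 2-hour, 0.25 mg/ml
sample exceeds 50% multivalent protein, while none of the untreated samples
does.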



B. Process Using Urea and Ethanol 



Multivalent antigen-binding proteins were produced from purified
single-chain proteins in the following way. First the purified single-chain
protein at a concentration of 0.25-1 mg/ml was dialyzed against 2 M urea, 20%
ethanol (EtOH), and 50 mM Tris buffer pH 8.0, for the times indicated in
Table 2B. This combination of dissociating agents is thought to disrupt the
VL/VH interface, allowing the VH of a first single-chain molecule to come into
contact with a VL from a second single-chain molecule. Other dissociating
agents such as isopropanol or methanol should be substitutable for EtOH.
Following the initial dialysis, the protein was dialyzed against the load buffer
for the final HPLC purification step.

Gel filtration HPLC chromatography (TSK G2000SW column from 
Toyo Soda, Tokyo, Japan) was used to identify and separate monomeric 
single-chain and multivalent antigen-binding proteins. This procedure has 
been described by Fukano, et al., J. Chromatography 166:47 (1978).

The results of Example 2B are shown in Table 2B. Table 2B shows 
the percentage of bivalent and other multivalent forms before and after 
treatment with 20% ethanol and urea. Percentages were determined using an 
automatic data integration software package. 

Table 2B

Summary of the generation of bivalent and higher
multivalent forms of
B6.2/212 and CC49/212 proteins using urea and ethanol

Protein     Time      Concentration   ------------------ % ------------------
            (hours)   (mg/ml)         monomer   dimer    trimer   multimers

B6.2         0        0.25             44.1     37.6     15.9      2.4
             0        1.0              37.7     33.7     19.4      9.4
             3        0.25             22.2     66.5     11.3      0.0
             3        1.0              13.7     69.9     16.4      0.0



Example 3 

Determination of Binding Constants 

Three anti-fluorescein single-chain antigen-binding proteins have been
constructed based on the anti-fluorescein monoclonal antibody 4-4-20. The
three 4-4-20 single-chain antigen-binding proteins differ in the polypeptide
linker connecting the VH and VL regions of the protein. The three linkers used
were 202', 212 and 216 (see Table 3). Bivalent and higher forms of the 4-4-20
antigen-binding protein were produced by concentrating the purified
monomeric single-chain antigen-binding protein in the cation exchange load
buffer (0.06 M MOPS, 0.001 M calcium acetate pH 6.4) to 5 mg/ml. The
bivalent and monomeric forms of the 4-4-20 antigen-binding proteins were
separated by cation exchange HPLC (polyaspartate column) using a 50 min.
linear gradient between the load buffer (0.06 M MOPS, 0.001 M calcium
acetate pH 6.4) and a second buffer (0.06 M MOPS, 0.02 M calcium acetate
pH 7.5). Two 0.02 ml samples were separated, and fractions of the bivalent
and monomeric protein peaks were collected on each run. The amount of
protein contained in each fraction was determined from the absorbance at 278
nm from the first separation. Before collecting the fractions from the second
separation run, each fraction tube had a sufficient quantity of 1.03 x 10* M
fluorescein added to it, such that after the fractions were collected a 1-to-1
molar ratio of protein-to-fluorescein existed. Addition of fluorescein stabilized
the bivalent form of the 4-4-20 antigen-binding proteins. These samples were
kept at 2°C (on ice).

The fluorescein dissociation rates were determined for each of these
samples following the procedures described by Herron, J.N., in Fluorescein
Hapten: An Immunological Probe, E.W. Voss, Ed., CRC Press, Boca Raton,
FL (1984). A sample was first diluted with 20 mM HEPES buffer pH 8.0 to
5.0 x 10^ M 4-4-20 antigen-binding protein. 560 µl of the 5.0 x 10^ M 4-4-20
antigen-binding protein sample was added to a cuvette in a fluorescence
spectrophotometer equilibrated at 2°C and the fluorescence was read. 140 µl
of 1.02 x 10-^ M fluoresceinamine was added to the cuvette, and the
fluorescence was read every 1 minute for up to 25 minutes (see Table 4).

The binding constants (Ka) for the 4-4-20 single-chain antigen-binding
protein monomers diluted in 20 mM HEPES buffer pH 8.0 in the absence of
fluorescein were also determined (see Table 4).

The three polypeptide linkers in these experiments differ in length.
The 202', 212 and 216 linkers are 12, 14 and 18 residues long, respectively.
These experiments show that there are two effects of linker length on the
4-4-20 antigen-binding proteins: first, the shorter the linker length the
higher the fraction of bivalent protein formed; second, the fluorescein
dissociation rates of the monomeric single-chain antigen-binding proteins are
affected more by the linker length than are the dissociation rates of the
bivalent antigen-binding proteins. With the shorter linkers 202' and 212, the
bivalent antigen-binding proteins have slower dissociation rates than the
monomers. Thus, the linkers providing optimum production and binding
affinities for monomeric and bivalent antigen-binding proteins may be
different. Longer linkers may be more suitable for monomeric single-chain
antigen-binding proteins, and shorter linkers may be more suitable for
multivalent antigen-binding proteins.



Table 3 
Linker Designs 

VL        Linker                     VH        Linker Name   Reference 

-KLEIE    GKSSGSGSESKS (1)           TQKLD-    202'          Bird et al. 
-KLEIK    GSTSGSGKSSEGKG (2)         EVKLD-    212           Bedzyk et al. 
-KLEIK    GSTSGSGKSSEGSGSTKG (3)     EVKLD-    216           This application 
-KLVLK    GSTSGKPSEGKG (4)           EVKLD-    217           This application 

(1) SEQ ID NO. 1 
(2) SEQ ID NO. 2 
(3) SEQ ID NO. 3 
(4) SEQ ID NO. 4 
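
The residue counts quoted for the linkers can be checked directly against the sequences in Table 3. A small sketch, assuming the one-letter sequences as read from the table; the names `LINKERS` and `linker_lengths` are introduced here for illustration only.

```python
# Linker sequences as read from Table 3 (one-letter amino acid codes).
LINKERS = {
    "202'": "GKSSGSGSESKS",
    "212": "GSTSGSGKSSEGKG",
    "216": "GSTSGSGKSSEGSGSTKG",
    "217": "GSTSGKPSEGKG",
}


def linker_lengths(linkers):
    """Return a mapping of linker name to its number of residues."""
    return {name: len(seq) for name, seq in linkers.items()}


lengths = linker_lengths(LINKERS)
for name, n_residues in lengths.items():
    print(f"{name}: {n_residues} residues")
```

The counts for 202', 212 and 216 agree with the 12, 14 and 18 residues stated in the text; the 217 linker, used in Example 7, comes out at 12 residues.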





Table 4 

Effects of Linkers on the SCA Protein Monomers and Dimers 

                                                Linker 
                            202'                212                 216 

Monomer 
  Fraction                  0.47                0.66                0.90 
  Ka                        0.5 x 10^[?] M^-1   1.0 x 10^[?] M^-1   1.3 x 10^[?] M^-1 
  Dissociation rate         8.2 x 10^-3 s^-1    4.9 x 10^-3 s^-1    3.3 x 10^-3 s^-1 

Dimer 
  Fraction                  0.53                0.34                0.10 
  Dissociation rate         4.6 x 10^-3 s^-1    3.5 x 10^-3 s^-1    3.5 x 10^-3 s^-1 

Monomer/Dimer 
  Dissociation rate ratio   1.8                 1.4                 0.9 
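
The monomer/dimer ratios in Table 4 follow from the tabulated dissociation rates, and each rate also implies a half-life for the bound complex. A minimal sketch, assuming every rate carries the same 10^-3 s^-1 exponent (the ratios themselves do not depend on that assumption); `K_OFF`, `rate_ratio` and `half_life_s` are illustrative names, not part of the source.

```python
import math

# Dissociation rates in s^-1 from Table 4. The shared 10^-3 exponent is an
# assumption; it is consistent with the ratios reported in the table.
K_OFF = {
    "202'": {"monomer": 8.2e-3, "dimer": 4.6e-3},
    "212": {"monomer": 4.9e-3, "dimer": 3.5e-3},
    "216": {"monomer": 3.3e-3, "dimer": 3.5e-3},
}


def rate_ratio(entry):
    """Monomer dissociation rate divided by dimer dissociation rate."""
    return entry["monomer"] / entry["dimer"]


def half_life_s(k_off):
    """Half-life in seconds of a first-order dissociation with rate k_off."""
    return math.log(2) / k_off


ratios = {name: round(rate_ratio(entry), 1) for name, entry in K_OFF.items()}
for name, entry in K_OFF.items():
    print(
        f"{name}: ratio {ratios[name]}, "
        f"monomer t1/2 {half_life_s(entry['monomer']):.0f} s, "
        f"dimer t1/2 {half_life_s(entry['dimer']):.0f} s"
    )
```

A ratio above 1 means the dimer releases fluorescein more slowly than the monomer, which is the case for the two shorter linkers.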



Example 4 
Genetic Construction of a Mixed-Fragment Bivalent Antigen-Binding Protein 



The genetic constructions for one particular heterobivalent antigen- 
binding protein according to the Rearrangement model are shown in Figures 
10A and 10B. Figure 10A is an amino acid and nucleotide sequence listing 
of the 4-4-20 VL/212/CC49 VH construct, coding for a single-chain protein 
with a 4-4-20 VL linked via a 212 polypeptide linker to a CC49 VH. Figure 
10B is a similar listing showing the CC49 VL/212/4-4-20 VH construct, coding 
for a single-chain protein with a CC49 VL linked via a 212 linker to a 4-4-20 
VH. These single-chain proteins may recombine according to the 
Rearrangement model to generate a heterobivalent protein comprising a CC49 
antigen-binding site linked to a 4-4-20 antigen-binding site, as shown in Figure 
5B. 

"4-4-20 VL" means the variable region of the light chain of the 4-4-20 
mouse monoclonal antibody (Bird, R.E. et al., Science 242:423 (1988)). The 
number "212" refers to a specific 14-residue polypeptide linker that links the 
4-4-20 VL and the CC49 VH. See Bedzyk, W.D. et al., J. Biol. Chem. 
265:18615-18620 (1990). "CC49 VH" is the variable region of the heavy 
chain of the CC49 antibody, which binds to the TAG-72 antigen. The CC49 
antibody was developed at The National Institutes of Health by Schlom, et al. 
Generation and Characterization of B72.3 Second Generation Monoclonal 
Antibodies Reactive With The Tumor-associated Glycoprotein 72 Antigen, 
Cancer Research 48:4588-4596 (1988). 

Insertion of the sequences shown in FIGS. 10A and 10B, by standard 
recombinant DNA methodology, into a suitable plasmid vector will enable one 
of ordinary skill in the art to transform a suitable host for subsequent 
expression of the single-chain proteins. See Maniatis et al., Molecular 
Cloning, A Laboratory Manual, p. 104, Cold Spring Harbor Laboratory 
(1982), for general recombinant techniques for accomplishing the aforesaid 
goals; see also U.S. Patent 4,946,778 (Ladner et al.) for a complete 
description of methods of producing single-chain protein molecules by 
recombinant DNA technology. 

To produce multivalent antigen-binding proteins from the two single- 
chain proteins, 4-4-20VL/212/CC49VH and CC49VL/212/4-4-20VH, the two 
single-chain proteins are dialyzed into 0.5 M GuHCl/20% EtOH, being 
combined in a single solution either before or after dialysis. The multivalent 
proteins are then produced and separated as described in Example 2. 

Example 5 

Preparation of Multivalent 
Antigen-Binding Proteins by Chemical Cross-Linking 

Free cysteines were engineered into the C-terminus of the 4-4-20/212 
single-chain antigen-binding protein, in order to chemically crosslink the 
protein. The design was based on the hinge region found in antibodies 
between the CH1 and CH2 regions. In order to try to reduce antigenicity in 
humans, the hinge sequence of the most common IgG class, IgG1, was 
chosen. The 4-4-20 Fab structure was examined and it was determined that 
the C-terminal sequence GluH216-ProH217-ArgH218 was part of the CH1 
region and that the hinge between CH1 and CH2 starts with ArgH218 or 
GlyH219 in the mouse 4-4-20 IgG2A antibody. Figure 14 shows the structure 
of a human IgG. The hinge region is indicated generally. Thus the hinge 
from human IgG1 would start with LysH218 or SerH219. (See Table 5). 

The C-terminal residue in most of the single-chain antigen-binding 
proteins described to date is the amino acid serine. In the design for the hinge 
region, the C-terminal serine in the 4-4-20/212 single-chain antigen-binding 
protein was made the first serine of the hinge and the second residue of the 
hinge was changed from a cysteine to a serine. This hinge cysteine normally 
forms a disulfide bridge to the C-terminal cysteine in the light chain. 



TABLE 5 

Protein                  Sequence (around residue H218) 

IgG2A mouse              EPRGPTIKP CPPCKC- 
IgG1 human               AEPK SCDKTHTCPPC- 
SCA*                     - - VTVS 
SCA* Hinge design 1      - - VTVSSDKTHTC 
SCA* Hinge design 2      - - VTVSSDKTHTCPPC 

* - single-chain antigen-binding protein 

(1) SEQ ID NO. 5 
(2) SEQ ID NO. 6 
(3) SEQ ID NO. 7 
(4) SEQ ID NO. 8 
(5) SEQ ID NO. 9 

There are possible advantages to having two C-terminal cysteines, for 
they might form an intramolecular disulfide bond, making the protein recovery 
easier by protecting the sulfurs from oxidation. The hinge regions were added 
by introduction of a BstE II restriction site in the 3'-terminus of the gene 
encoding the 4-4-20/212 single-chain antigen-binding protein (see Figures 
15A-15B). 

The monomeric single-chain antigen-binding protein containing the C- 
terminal cysteine can be purified using the normal methods of purifying 
single-chain antigen-binding proteins, with minor modifications to protect the 
free sulfhydryls. The cross-linking could be accomplished in one of two 
ways. First, the purified single-chain antigen-binding protein could be treated 
with a mild reducing agent, such as dithiothreitol, then allowed to air oxidize 
to form a disulfide-bond between the individual single-chain antigen-binding 
proteins. This type of chemistry has been successful in producing 
heterodimers from whole antibodies (Nisonoff et al., Quantitative Estimation 
of the Hybridization of Rabbit Antibodies, Nature 4826:355-359 (1962); 
Brennan et al., Preparation of Bispecific Antibodies by Chemical 
Recombination of Monoclonal Immunoglobulin G1 Fragments, Science 229:81- 
83 (1985)). Second, chemical crosslinking agents such as bis-maleimidehexane 
could be used to cross-link two single-chain antigen-binding proteins by their 
C-terminal cysteines. See Partis et al., J. Prot. Chem. 2:263-277 (1983). 

Example 6 

Genetic Construction of Bivalent Antigen-Binding Proteins 

Bivalent antigen-binding proteins can be constructed genetically and 
subsequently expressed in E. coli or other known expression systems. This 
can be accomplished by genetically removing the stop codons at the end of a 
gene encoding a monomeric single-chain antigen-binding protein and inserting 
a linker and a gene encoding a second single-chain antigen-binding protein. 
We have constructed a gene for a bivalent CC49/212 antigen-binding protein 
in this manner (see Figure 16). The CC49/212 gene in the starting expression 
plasmid is in an Aat II to Bam HI restriction fragment (see Bird et al., Single- 
Chain Antigen-Binding Proteins, Science 242:423-426 (1988); and Whitlow 
et al., Single-Chain Fv Proteins and Their Fusion Proteins, Methods 2:97-105 
(1991)). The two stop codons and the Bam HI site at the C-terminal end of 
the CC49/212 antigen-binding protein gene were replaced by a single residue 
linker (Ser) and an Aat II restriction site. The resulting plasmid was cut with 
Aat II and the purified Aat II to Aat II restriction fragment was ligated into 
Aat II cut CC49/212 single-chain antigen-binding protein expression plasmid. 
The resulting bivalent CC49/212 single-chain antigen-binding protein 
expression plasmid was transfected into an E. coli expression host that 
contained the gene for the cI857 temperature-sensitive repressor. Expression 
of single-chain antigen-binding protein in this system is induced by raising the 
temperature from 30°C to 42°C. Fig. 17 shows the expression of the divalent 
CC49/212 single-chain antigen-binding protein of Fig. 16 at 42°C, on an SDS- 
PAGE gel containing total E. coli protein. Lane 1 contains the molecular 
weight standards. Lane 2 is the uninduced E. coli production strain grown at 
30°C. Lane 3 is divalent CC49/212 single-chain antigen-binding protein 
induced by growth at 42°C. The arrow shows the band of expressed divalent 
CC49/212 single-chain antigen-binding protein. 

Example 7 

Construction, Purification, and Testing of 4-4-20/CC49 
Heterodimer Fv With 217 Linkers. 

The goals of this experiment were to produce, purify and analyze for 
activity a new heterodimer Fv that would bind to both fluorescein and the pan- 
carcinoma antigen TAG-72. The design consisted of two polypeptide chains, 
which associated to form the active heterodimer Fv. Each polypeptide chain 
can be described as a mixed single-chain Fv (mixed sFv). The first mixed sFv 
(GX 8952) comprised a 4-4-20 variable light chain (VL) and a CC-49 variable 
heavy chain (VH) connected by a 217 polypeptide linker (Figure 19A). The 
second mixed sFv (GX 8953) comprised a CC-49 VL and a 4-4-20 VH 
connected by a 217 polypeptide linker (Figure 19B). The sequence of the 217 
polypeptide linker is shown in Table 3. Construction of analogous CC49/4-4- 
20 heterodimers connected by a 212 polypeptide linker is described in 
Example 4. 



A. Purification 

One 10-liter fermentation of each mixed sFv was grown on casein 
digest-glucose-salts medium at 32°C to an optical density at 600 nm of 15 to 
20. The mixed sFv expression was induced by raising the temperature of the 
fermentation to 42°C for one hour. 277gm (wet cell weight) of E. coli strain 
GX 8952 and 233gm (wet cell weight) of E. coli strain GX 8953 were 
harvested in a centrifuge at 7000g for 10 minutes. The cell pellets were kept 
and the supernatant discarded. The cell pellets were frozen at -20°C for 
storage. 



Results 



WO 93/11161                                        PCT/US92/09965 



2.55 liters of "lysis/wash buffer" (50mM Tris/ 200mM NaCl/ 1 mM 
EDTA, pH 8.0) was added to both of the mixed sFv's cell pellets, which were 
previously thawed and combined to give 510gm of total wet cell weight. After 
complete suspension of the cells they were then passed through a Gaulin 
homogenizer at 9000psi and 4°C. After this first pass the temperature 
increased to 23°C. The temperature was immediately brought down to 0°C 
using dry ice and methanol. The cell suspension was passed through the 
Gaulin homogenizer a second time and centrifuged at 8000 rpm with a Dupont 
GS-3 rotor for 60 minutes. The supernatant was discarded after centrifugation 
and the pellets resuspended in 2.5 liters of "lysis/wash buffer" at 4°C. This 
suspension was centrifuged for 45 minutes at 8000 rpm with the Dupont GS-3 
rotor. The supernatant was again discarded and the pellet weighed. The 
pellet weight was 136.1 gm. 

1300ml of 6M Guanidine Hydrochloride/50mM Tris/50mM KCl/10mM 
CaCl2, pH 8.0 at 4°C was added to the washed pellet. An overhead mixer was 
used to speed solubilization. After one hour of mixing, the heterodimer 
GuHCl extract was centrifuged for 45 minutes at 8000 rpm and the pellet was 
discarded. The 1425ml of heterodimer Fv 6M GuHCl extract was slowly 
added (16 ml/min) to 14.1 liters of "Refold Buffer" (50mM Tris/50mM 
KCl/10mM CaCl2, pH 8.0) under constant mixing at 4°C to give an 
approximate dilution of 1:10. Refolding took place overnight at 4°C. 
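
The arithmetic behind the refolding step can be laid out explicitly. This is a sketch under the stated volumes only, assuming ideal mixing with additive volumes and that the extract is still nominally 6 M in guanidine hydrochloride; the function `refold_dilution` and its field names are illustrative.

```python
def refold_dilution(extract_ml, buffer_ml, denaturant_molar, feed_ml_per_min):
    """Summarize a dilution-refolding step, assuming additive volumes."""
    total_ml = extract_ml + buffer_ml
    return {
        "total_volume_l": total_ml / 1000.0,
        "fold_dilution": total_ml / extract_ml,
        "final_denaturant_molar": denaturant_molar * extract_ml / total_ml,
        "addition_time_min": extract_ml / feed_ml_per_min,
    }


# 1425 ml of 6 M GuHCl extract fed at 16 ml/min into 14.1 liters of buffer.
result = refold_dilution(1425.0, 14100.0, 6.0, 16.0)
for key, value in result.items():
    print(f"{key}: {value:.3f}")
```

On these figures the extract is diluted roughly elevenfold into about 15.5 liters, leaving a little over 0.5 M guanidine hydrochloride, and the addition takes about an hour and a half.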

After 17 hours of refolding the anti-fluorescein activity was checked by 
a 40% quenching assay, and the amount of active protein calculated. 150mg 
total active heterodimer Fv was found by the 40% quench assay, assuming a 
54,000 molecular weight. 

4 liters of prechilled (4°C) 190 proof ethanol was added to the 15 liters 
of refolded heterodimer with mixing for 3 hours. The mixture sat overnight 
at 4°C. A flocculent precipitate had settled to the bottom after this overnight 
treatment. The nearly clear solution was filtered through a Millipak-200 
(0.22µ) filter so as to not disturb the precipitate. A 40% quench assay 
showed that 10% of the anti-fluorescein activity was recovered in the filtrate. 

The filtered sample of heterodimer was dialyzed, using a Pellicon 
system containing 10,000 dalton MWCO membranes, with "dialysis buffer" 
40mM MOPS/0.5mM Calcium Acetate (CaAc), pH 6.4 at 4°C. 20 liters of 
dialysis buffer was required before the conductivity of the retentate was equal 
to that of the dialysis buffer (~500 µS). After dialysis the heterodimer sample 
was filtered through a Millipak-20 filter, 0.22µ. After this step a 40% quench 
assay showed there was 8.8 mg of active protein. 

The crude heterodimer sample was loaded on a PolyCAT A cation 
exchange column at 20ml/min. The column was previously equilibrated with 
60mM MOPS, 1 mM CaAc pH 6.4 at 4°C (Buffer A). After loading, the 
column was washed with 150ml of "Buffer A" at 15ml/min. A 50min linear 
gradient was performed at 15ml/min using "Buffer A" and "Buffer B" (60mM 
MOPS, 20mM CaAc pH 7.5 at 4°C). The gradient conditions are presented 
in Table 6. "Buffer C" comprises 60mM MOPS, 100mM CaCl2, pH 7.5. 



Table 6 

Time      %A       %B       %C       Flow 

0:00      100.0    0.0      0.0      15ml/min 
50:00     0.0      100.0    0.0      15ml/min 
52:00     0.0      100.0    0.0      15ml/min 
54:00     0.0      0.0      100.0    15ml/min 
58:00     0.0      0.0      100.0    15ml/min 
60:00     100.0    0.0      0.0      15ml/min 
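
The set points of Table 6 can be read as a piecewise-linear pump program. The sketch below assumes linear ramps between every pair of neighbouring set points; the text states linearity only for the 0 to 50 minute gradient, so the later segments are an assumption. `GRADIENT` and `composition_at` are hypothetical names.

```python
# Gradient program from Table 6: (time in minutes, %A, %B, %C).
GRADIENT = [
    (0.0, 100.0, 0.0, 0.0),
    (50.0, 0.0, 100.0, 0.0),
    (52.0, 0.0, 100.0, 0.0),
    (54.0, 0.0, 0.0, 100.0),
    (58.0, 0.0, 0.0, 100.0),
    (60.0, 100.0, 0.0, 0.0),
]


def composition_at(minutes, program=GRADIENT):
    """Linearly interpolate (%A, %B, %C) between neighbouring set points."""
    if not program[0][0] <= minutes <= program[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, *c0), (t1, *c1) in zip(program, program[1:]):
        if t0 <= minutes <= t1:
            frac = (minutes - t0) / (t1 - t0)
            return tuple(a + frac * (b - a) for a, b in zip(c0, c1))


for t in (0.0, 25.0, 50.0, 53.0, 60.0):
    a, b, c = composition_at(t)
    print(f"{t:5.1f} min: {a:5.1f}% A, {b:5.1f}% B, {c:5.1f}% C")
```

Halfway through the main gradient the eluent is an equal mix of Buffers A and B; the program then steps to Buffer C and returns to Buffer A by 60 minutes.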



Approximately 50ml fractions were collected and analyzed for activity, 
purity, and molecular weight by size-exclusion chromatography. The fractions 
were not collected by peaks, so contamination between peaks is likely. 
Fractions 3 through 7 were pooled (total volume 218ml), concentrated to 
50ml and dialyzed against 4 liters of 60mM MOPS, 0.5mM CaAc pH 6.4 at 
4°C overnight. The dialyzed pool was filtered through a 0.22µ filter and 
checked for absorbance at 280nm. The filtrate was loaded onto the PolyCAT 
A column, equilibrated with 60mM MOPS, 1 mM CaAc pH 6.4 at 4°C, at a 
flow rate of 10ml/min. Buffer B was changed to 60mM MOPS, 10mM CaAc 
pH 7.5 at 4°C. The gradient was run as in Table 6. The fractions were 
collected by peak and analyzed for activity, purity, and molecular weight. 
The chromatogram is shown in Figure 20. Fraction identification and analysis 
is presented in Table 7. 



Table 7 

Fraction Analysis of the Heterodimer Fv protein 

Fraction    A280 reading    Total Volume    HPLC-SE Elution Time 
No.                         (ml)            (min) 

2           0.161           36              20.525 
3           0.067           40 
4           0.033           40 
5           0.178           45              19.133 
6           0.234           50              19.163 
7           0.069           50 
8           0.055           40 





Fractions 2 to 7 and the starting material were analyzed by SDS gel 
electrophoresis, 4-20%. A picture and description of the gel is presented in 
Figure 21. 

B. HPLC Size Exclusion Results 

Fractions 2, 5, and 6 correspond to the three main peaks in Figure 20 
and therefore were chosen to be analyzed by HPLC size exclusion. Fraction 
2 corresponds to the peak that runs at 21.775 minutes in the preparative 
purification (Figure 20), and runs on the HPLC sizing column at 20.525 
minutes, which is in the monomeric position (Figure 22A). Fractions 5 and 
6 (30.1 and 33.455 minutes, respectively, in Figure 20) run on the HPLC 
sizing column (Figures 22B and 22C) at 19.133 and 19.163 minutes, 
respectively (see Table 7). Therefore, both of these peaks could be considered 
dimers. 40% quenching assays were performed on all fractions of this 
purification. Only fraction 5 gave significant activity. 2.4 mg of active 
CC49/4-4-20 heterodimer Fv was recovered in fraction 5, based on the 
Scatchard analysis described below. 
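
The active-protein figures quoted through the purification can be turned into step and overall yields. This is a rough sketch only: the three amounts come from the text, but they were obtained with different assays (40% quench for the first two, Scatchard analysis for the last), so the percentages are approximate. `STEPS` and `step_yields` are illustrative names.

```python
# Active heterodimer Fv in mg, as quoted in the text for each stage.
STEPS = [
    ("refolded extract", 150.0),
    ("after ethanol treatment, filtration and dialysis", 8.8),
    ("fraction 5 of the second cation exchange run", 2.4),
]


def step_yields(steps):
    """Return (name, mg, % of previous stage, % of starting amount) rows."""
    rows = []
    start = steps[0][1]
    previous = start
    for name, mg in steps:
        rows.append((name, mg, 100.0 * mg / previous, 100.0 * mg / start))
        previous = mg
    return rows


rows = step_yields(STEPS)
for name, mg, step_pct, overall_pct in rows:
    print(f"{name}: {mg} mg, {step_pct:.1f}% of previous, {overall_pct:.1f}% overall")
```

On these numbers most of the loss occurs before the column steps, and the final active heterodimer is a small percentage of the activity measured after refolding.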

C. N-terminal sequencing of the fractions 

The active heterodimer Fv fraction should contain both polypeptide 
chains. N-terminal sequence analysis showed that fractions 5 and 6 displayed 
N-terminal sequences consistent with the presence of both CC49 and 4-4-20 
polypeptides and fraction 2 displayed a single sequence corresponding to the 
CC49/212/4-4-20 polypeptide only. We believe that fraction 6 was 
contaminated by fraction 5 (see Figure 20), since only fraction 5 had 
significant activity. 

D. Anti-fluorescein activity by Scatchard analysis 

The fluorescein association constants (Ka) were determined for 
fractions 5 and 6 using the fluorescence quenching assay described by Herron, 
J.N., in Fluorescein Hapten: An Immunological Probe, E.W. Voss, ed., 
CRC Press, Boca Raton, FL (1984). Each sample was diluted to 
approximately 5.0 x 10^-[?] M with 20 mM HEPES buffer pH 8.0. 590 µl of the 
5.0 x 10^-[?] M sample was added to a cuvette in a fluorescence 
spectrophotometer equilibrated at room temperature. In a second cuvette 590 
µl of 20 mM HEPES buffer pH 8.0 was added. To each cuvette was added 
10 µl of 3.0 x 10^-[?] M fluorescein in 20 mM HEPES buffer pH 8.0, and the 
fluorescence recorded. This was repeated until 140 µl of fluorescein had been 
added. The resulting Scatchard analysis for fraction 5 shows a binding 
constant of 1.16 x 10^[?] M^-1 (see Figure 23). This is very close 
to the 4-4-20/212 sFv constant of 1.1 x 10^[?] M^-1 (see Pantoliano et al., 
Biochemistry 30:10117-10125 (1991)). The R intercept on the Scatchard 
analysis represents the fraction of active material. For fraction 5, 61% of the 
material was active. The graph of the Scatchard analysis on fraction 6 shows 
a binding constant of 3.3 x 10^[?] M^-1 and 14% active. The activity that is 
present in fraction 6 is most likely due to contamination from fraction 5. 
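
How a binding constant and an active fraction come out of a Scatchard plot can be sketched as follows. The example assumes the standard single-site relation r/c = Ka (n - r), where r is bound fluorescein per protein molecule, c is the free fluorescein concentration and n is the intercept on the r axis. The data are synthetic, the Ka used to generate them is arbitrary, and `scatchard_fit` is a hypothetical helper rather than the procedure of the cited reference.

```python
def scatchard_fit(bound_per_protein, free_molar):
    """Fit r/c = Ka * (n - r) by least squares.

    Returns (Ka in M^-1, n), where n is the r-axis intercept, read here as
    the fraction of protein that is active.
    """
    xs = list(bound_per_protein)
    ys = [r / c for r, c in zip(xs, free_molar)]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, -intercept / slope


# Synthetic titration generated from an arbitrary Ka and a 61% active fraction.
demo_ka = 2.0e8  # M^-1, illustrative value only
demo_n = 0.61
free = [1e-9, 2e-9, 5e-9, 1e-8, 2e-8, 5e-8]
bound = [demo_n * demo_ka * c / (1.0 + demo_ka * c) for c in free]

ka, n_active = scatchard_fit(bound, free)
print(f"Ka = {ka:.3g} M^-1, active fraction = {n_active:.2f}")
```

The slope of the fitted line gives the negative of the association constant, and the intercept on the r axis gives the active fraction, which is how the percentages of active material quoted above are read from the plot.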

E. Anti-TAG-72 activity by competition ELISA 

The CC49 monoclonal antibody was developed by Dr. Jeffrey Schlom's 
group, Laboratory of Tumor Immunology and Biology, National Cancer 
Institute. It binds specifically to the pan-carcinoma tumor antigen TAG-72. 
See Muraro, R., et al., Cancer Research 48:4588-4596 (1988). 

To determine the binding properties of the bivalent CC49/4-4-20 Fv 
(fraction 5) and the CC49/212 sFv, a competition enzyme-linked 
immunosorbent assay (ELISA) was set up in which a CC49 IgG labeled with 
biotin was competed against unlabeled CC49/4-4-20 Fv and the CC49/212 sFv 
for binding to TAG-72 on a human breast carcinoma extract (see Figure 24). 
The amount of biotin-labeled CC49 IgG was determined using a preformed 
complex with avidin and biotin coupled to horseradish peroxidase and o- 
phenylenediamine dihydrochloride (OPD). The reaction was stopped with 4N 
H2SO4 (sulfuric acid) after 10 min. and the optical density read at 490nm. 
This competition ELISA showed that the bivalent CC49/4-4-20 Fv binds to the 
TAG-72 antigen. The CC49/4-4-20 Fv needed a two hundred-fold higher 
protein concentration to displace the IgG than the single-chain Fv. 

Example 8 
Cross-Linking Antigen-Binding Dimers 

We have chemically crosslinked dimers of 4-4-20/212 antigen-binding 
protein with the two cysteine C-terminal extension (4-4-20/212 CPPC single- 
chain antigen-binding protein) in two ways. In Example 5 we describe the 
design and genetic construction of the 4-4-20/212 CPPC single-chain antigen- 
binding protein (hinge design 2 in Table 5). Figure 15B shows the nucleic 
acid and protein sequences of this protein. After purifying the 4-4-20/212 
CPPC single-chain antigen-binding protein, using the methods described in 
Whitlow and Filpula, Meth. Enzymol. 2:97 (1991), dimers were formed by 
two methods. First, the free cysteines were mildly reduced with dithiothreitol 
(DTT) and then the disulfide-bonds between the two molecules were allowed 
to form by air oxidation. Second, the chemical crosslinker bis- 
maleimidehexane was used to produce dimers by crosslinking the free 
cysteines from two 4-4-20/212 CPPC single-chain antigen-binding proteins. 

A 0.1 mg/ml solution of the 4-4-20/212 CPPC single-chain antigen- 
binding protein was mildly reduced using 1 mM DTT, 50 mM HEPES, 50mM 
NaCl, 1 mM EDTA buffer pH 8.0 at 4°C. The samples were dialyzed against 
50mM HEPES, 50 mM NaCl, 1 mM EDTA buffer pH 8.0 at 4°C overnight, 
to allow the oxidation of free sulfhydryls to intermolecular disulfide-bonds. 
Figure 25 shows a non-reducing SDS-PAGE gel after the air oxidation; it 
shows that approximately 10% of the 4-4-20/212 CPPC protein formed dimers 
with molecular weights around 55,000 Daltons. 

A 0.1 mg/ml solution of the 4-4-20/212 CPPC single-chain antigen- 
binding protein was treated with 2 mM bis-maleimidehexane. Unlike forming 
a disulfide-bond between two free cysteines in the previous example, the bis- 
maleimidehexane crosslinker material should be stable to reducing agents such 
as β-mercaptoethanol. Figure 26 shows that approximately 5% of the treated 
material produced dimer with a molecular weight of 55,000 Daltons on a 
reducing SDS-PAGE gel (samples were treated with β-mercaptoethanol prior 
to being loaded on the gel). We further purified the bis-maleimidehexane 
treated 4-4-20/212 CPPC protein on PolyCAT A cation exchange column after 
the protein had been extensively dialyzed against buffer A. Figure 26 shows 
that we were able to enhance the fraction containing the dimer to 
approximately 15%. 
2. Description of the Background Art 

Antibodies are proteins generated by the immune system to provide a 
specific molecule capable of complexing with an invading molecule, termed 
an antigen. Figure 14 shows the structure of a typical antibody molecule. 
Natural antibodies have two identical antigen-binding sites, both of which are 
specific to a particular antigen. The antibody molecule "recognizes" the 
antigen by complexing its antigen-binding sites with areas of the antigen 
termed epitopes. The epitopes fit into the conformational architecture of the 
antigen-binding sites of the antibody, enabling the antibody to bind to the 
antigen. 

The antibody molecule is composed of two identical heavy and two 
identical light polypeptide chains, held together by interchain disulfide bonds 
(see Fig. 14). The remainder of this discussion will refer only to one 
light/heavy pair of chains, as each light/heavy pair is identical. Each 
individual light and heavy chain folds into regions of approximately 110 amino 
acids, assuming a conserved three-dimensional conformation. The light chain 
comprises one variable region (termed VL) and one constant region (CL), while 
the heavy chain comprises one variable region (VH) and three constant regions 
(CH1, CH2 and CH3). Pairs of regions associate to form discrete structures as 
shown in Figure 14. In particular, the light and heavy chain variable regions, 
VL and VH, associate to form an "Fv" area which contains the antigen-binding 
site. 

The variable regions of both heavy and light chains show considerable 
variability in structure and amino acid composition from one antibody 
molecule to another, whereas the constant regions show little variability. The 
term "variable" as used in this specification refers to the diverse nature of the 
amino acid sequences of the antibody heavy and light chain variable regions. 
Each antibody recognizes and binds antigen through the binding site defined 
by the association of the heavy and light chain variable regions into an Fv 
area. The light-chain variable region VL and the heavy-chain variable region 
VH of a particular antibody molecule have specific amino acid sequences that 
allow the antigen-binding site to assume a conformation that binds to the 
antigen epitope recognized by that particular antibody. 

Within the variable regions are found regions in which the amino acid 
sequence is extremely variable from one antibody to another. Three of these 
so-called "hypervariable" regions or "complementarity-determining regions" 
(CDR's) are found in each of the light and heavy chains. The three CDR's 
from a light chain and the three CDR's from a corresponding heavy chain 
form the antigen-binding site. 

Cleavage of the naturally-occurring antibody molecule with the 
proteolytic enzyme papain generates fragments which retain their antigen- 
binding site. These fragments, commonly known as Fab's (for Fragment, 
antigen binding site) are composed of the CL, VL, CH1 and VH regions of the 
antibody. In the Fab the light chain and the fragment of the heavy chain are 
covalently linked by a disulfide linkage. 

Recent advances in immunobiology, recombinant DNA technology, and 
computer science have allowed the creation of single polypeptide chain 
molecules that bind antigen. These single-chain antigen-binding molecules 
incorporate a linker polypeptide to bridge the individual variable regions, 
VL and VH, into a single polypeptide chain. A computer-assisted method for 
linker design is described more particularly in U.S. Patent No. 4,704,692, 
issued to Ladner et al. in November, 1987, and incorporated herein by 
reference. A description of the theory and production of single-chain antigen- 
binding proteins is found in U.S. Patent No. 4,946,778 (Ladner et al.), issued 
August 7, 1990, and incorporated herein by reference. The single-chain 
antigen-binding proteins produced under the process recited in U.S. Patent 
4,946,778 have binding specificity and affinity substantially similar to that of 
the corresponding Fab fragment. 

Bifunctional, or bispecific, antibodies have antigen binding sites of
different specificities. Bispecific antibodies have been generated to deliver
cells, cytotoxins, or drugs to specific sites. An important use has been to
deliver host cytotoxic cells, such as natural killer or cytotoxic T cells, to
specific cellular targets (U.D. Staerz, O. Kanagawa, M.J. Bevan, Nature
314:628 (1985); S. Songsivilai, P.J. Lachmann, Clin. Exp. Immunol. 79:315
(1990)). Another important use has been to deliver cytotoxic proteins to
specific cellular targets (V. Raso, T. Griffin, Cancer Res. 41:2073 (1981);
S. Honda, Y. Ichimori, S. Iwasa, Cytotechnology 4:59 (1990)). Another
important use has been to deliver anti-cancer non-protein drugs to specific
cellular targets (J. Corvalan, W. Smith, V. Gore, Intl. J. Cancer Suppl. 2:22
(1988); Pimm et al., British J. of Cancer 61:508 (1990)). Such bispecific
antibodies have been prepared by chemical cross-linking (M. Brennan et al.,
Science 229:81 (1985)), disulfide exchange, or the production of
hybrid-hybridomas (quadromas). Quadromas are constructed by fusing hybridomas
that secrete two different types of antibodies against two different antigens
(Kurokawa, T. et al., Biotechnology 7:1163 (1989)).



Summary of the Invention



This invention relates to the discovery that multivalent forms of
single-chain antigen-binding proteins have significant utility beyond that of
the monovalent single-chain antigen-binding proteins. A multivalent
antigen-binding protein has more than one antigen-binding site. Enhanced binding
activity, di- and multi-specific binding, and other novel uses of multivalent 
antigen-binding proteins have been demonstrated or are envisioned here. 
Accordingly, the invention is directed to multivalent forms of single-chain 
antigen-binding proteins, compositions of multivalent and single-chain antigen- 
binding proteins, methods of making and purifying multivalent forms of single- 
chain antigen-binding proteins, and uses for multivalent forms of single-chain 
antigen-binding proteins. The invention provides a multivalent antigen-binding 
protein comprising two or more single-chain protein molecules, each single- 
chain molecule comprising a first polypeptide comprising the binding portion 
of the variable region of an antibody heavy or light chain; a second 
polypeptide comprising the binding portion of the variable region of an 
antibody heavy or light chain; and a peptide linker linking the first and second
polypeptides into a single-chain protein.

Also provided is a composition comprising a multivalent antigen-binding
protein substantially free of single-chain molecules.

Also provided is an aqueous composition comprising an excess of 
multivalent antigen-binding protein over single-chain molecules. 
A method of producing a multivalent antigen-binding protein is
provided, comprising the steps of producing a composition comprising
multivalent antigen-binding protein and single-chain molecules, each
single-chain molecule comprising a first polypeptide comprising the binding
portion of the variable region of an antibody heavy or light chain; a second
polypeptide comprising the binding portion of the variable region of an
antibody heavy or light chain; and a peptide linker linking the first and second
polypeptides into a single-chain molecule; separating the multivalent protein
from the single-chain molecules; and recovering the multivalent protein.

Also provided is a method of producing multivalent antigen-binding
protein, comprising the steps of producing a composition comprising
single-chain molecules as previously defined; dissociating the single-chain
molecules; reassociating the single-chain molecules; separating the resulting
multivalent antigen-binding proteins from the single-chain molecules; and
recovering the multivalent proteins.

Also provided is another method of producing a multivalent
antigen-binding protein, comprising the step of chemically cross-linking at
least two single-chain antigen-binding molecules.

Also provided is another method of producing a multivalent
antigen-binding protein, comprising the steps of producing a composition
comprising single-chain molecules as previously defined; concentrating said
single-chain molecules; separating said multivalent protein from said
single-chain molecules; and finally recovering said multivalent protein.

Also provided is another method of producing a multivalent
antigen-binding protein comprising two or more single-chain molecules, each
single-chain molecule as previously defined, said method comprising: providing
a genetic sequence coding for said single-chain molecule; transforming a host
cell or cells with said sequence; expressing said sequence in said host or
hosts; and recovering said multivalent protein.

Another aspect of the invention includes a method of detecting an 
antigen in or suspected of being in a sample, which comprises contacting said 
sample with the multivalent antigen-binding protein of claim 1 and detecting
whether said multivalent antigen-binding protein has bound to said antigen. 

Another aspect of the invention includes a method of imaging the 
internal structure of an animal, comprising administering to said animal an
effective amount of a labeled form of the multivalent antigen-binding protein 
of claim 1 and measuring detectable radiation associated with said animal. 

Another aspect of the invention includes a composition comprising an 
association of a multivalent antigen-binding protein with a therapeutically or 
diagnostically effective agent. 

Another aspect of this invention is a single-chain protein comprising:
(a) a first polypeptide comprising the binding portion of the variable region
of an antibody light chain; (b) a second polypeptide comprising the binding
portion of the variable region of an antibody light chain; and (c) a peptide
linker linking said first and second polypeptides (a) and (b) into said
single-chain protein.

Another aspect of the present invention includes the genetic 
constructions encoding the combinations of regions VL-VL and VH-VH for
single-chain molecules, and encoding multivalent antigen-binding proteins. 

Another part of this invention is a multivalent single-chain
antigen-binding protein comprising: (a) a first polypeptide comprising the
binding portion of the variable region of an antibody heavy or light chain;
(b) a second polypeptide comprising the binding portion of the variable region
of an antibody heavy or light chain; (c) a peptide linker linking said first
and second polypeptides (a) and (b) into said multivalent protein; (d) a third
polypeptide comprising the binding portion of the variable region of an
antibody heavy or light chain; (e) a fourth polypeptide comprising the binding
portion of the variable region of an antibody heavy or light chain; (f) a
peptide linker linking said third and fourth polypeptides (d) and (e) into said
multivalent protein; and (g) a peptide linker linking said second and third
polypeptides (b) and (d) into said multivalent protein. Also included are
genetic constructions coding for this multivalent single-chain antigen-binding
protein.

Also included are replicable cloning or expression vehicles including
plasmids, hosts transformed with the aforementioned genetic sequences, and 
methods of producing multivalent proteins with the sequences, transformed 
hosts, and expression vehicles. 

Methods of use are provided, such as a method of using the multivalent 
antigen-binding protein to diagnose a medical condition; a method of using the 
multivalent protein as a carrier to image the specific bodily organs of an 
animal; a therapeutic method of using the multivalent protein to treat a medical 
condition; and an immunotherapeutic method of conjugating a multivalent 
protein with a therapeutically or diagnostically effective agent. Also included 
are labelled multivalent proteins, improved immunoassays using them, and 
improved immunoaffinity purifications. 

An advantage of using multivalent antigen-binding proteins instead of 
single-chain antigen-binding molecules or Fab fragments lies in the enhanced 
binding ability of the multivalent form. Enhanced binding occurs because the
multivalent form has more binding sites per molecule. Another advantage of 
the present invention is the ability to use multivalent antigen-binding proteins 
as multi-specific binding molecules. 

An advantage of using multivalent antigen-binding proteins instead of
whole antibodies is that, being smaller than whole antibodies, they clear
from the serum more rapidly, which may afford lower background in imaging
applications.
Multivalent antigen-binding proteins may penetrate solid tumors better than 
monoclonals, resulting in better tumor-fighting ability. Also, because they are 
smaller and lack the Fc component of intact antibodies, the multivalent 
antigen-binding proteins of the present invention may be less immunogenic 
than whole antibodies. The Fc component of whole antibodies also contains 
binding sites for liver, spleen and certain other cells and its absence should 
thus reduce accumulation in non-target tissues.

Another advantage of multivalent antigen-binding proteins is the ease
with which they may be produced and engineered, as compared to the
myeloma-fusing technique pioneered by Kohler and Milstein that is used to
produce whole antibodies.

Brief Description of the Drawings. 

The present invention as defined in the claims can be better understood 
with reference to the text and to the following drawings: 

FIG. 1A is a schematic two-dimensional representation of two identical
single-chain antigen-binding protein molecules, each comprising a variable
light chain region (VL), a variable heavy chain region (VH), and a polypeptide
linker joining the two regions. The single-chain antigen-binding protein
molecules are shown binding antigen in their antigen-binding sites.

FIG. 1B depicts a hypothetical homodivalent antigen-binding protein
formed by association of the polypeptide linkers of two monovalent
single-chain antigen-binding proteins from FIG. 1A (the Association model). The
divalent antigen-binding protein is formed by the concentration-driven
association of two identical single-chain antigen-binding protein molecules.

FIG. 1C depicts the hypothetical divalent protein of FIG. 1B with
bound antigen molecules occupying both antigen-binding sites.

FIG. 2A depicts the hypothetical homodivalent protein of FIG. 1B.

FIG. 2B depicts three single-chain antigen-binding protein molecules
associated in a hypothetical trimer.

FIG. 2C depicts a hypothetical tetramer of four single-chain antigen- 
binding protein molecules. 

FIG. 3A depicts two separate and distinct monovalent single-chain
antigen-binding proteins, Anti-A single-chain antigen-binding protein and
Anti-B single-chain antigen-binding protein, with different antigen
specificities, each individually binding either Antigen A or Antigen B.

FIG. 3B depicts a hypothetical bispecific heterodivalent antigen-binding
protein formed from the single-chain antigen-binding proteins of FIG. 3A
according to the Association model.

FIG. 3C depicts the hypothetical heterodivalent antigen-binding protein
of FIG. 3B binding bispecifically, i.e., binding the two different antigens, A
and B.

FIG. 4A depicts two identical single-chain antigen-binding protein
molecules, each having a variable light chain region (VL), a variable heavy
chain region (VH), and a polypeptide linker joining the two regions. The
single-chain antigen-binding protein molecules are shown binding identical
antigen molecules in their antigen-binding sites.

FIG. 4B depicts a hypothetical homodivalent protein formed by the 
rearrangement of the VL and VH regions shown in FIG. 4A (the
Rearrangement model). Also shown is bound antigen. 

FIG. 5A depicts two single-chain protein molecules, the first having an
anti-B VL and an anti-A VH, and the second having an anti-A VL and an anti-B
VH. The figure shows the non-complementary nature of the VL and VH
regions in each single-chain protein molecule.

FIG. 5B shows a hypothetical bispecific heterodivalent antigen-binding
protein formed by rearrangement of the two single-chain proteins of Figure
5A.

FIG. 5C depicts the hypothetical heterodivalent antigen-binding protein
of FIG. 5B with different antigens A and B occupying their respective
antigen-binding sites.

FIG. 6A is a schematic depiction of a hypothetical trivalent
antigen-binding protein according to the Rearrangement model.

FIG. 6B is a schematic depiction of a hypothetical tetravalent antigen- 
binding protein according to the Rearrangement model. 

FIG. 7 is a chromatogram depicting the separation of CC49/212 
antigen-binding protein monomer from dimer on a cation exchange high 
performance liquid chromatographic column. The column is a PolyCAT A
aspartic acid column (Poly LC, Columbia, MD). Monomer is shown as Peak
1, eluting at 27.32 min., and dimer is shown as Peak 2, eluting at 55.52 min.

FIG. 8 is a chromatogram of the purified monomer from FIG. 7.
Monomer elutes at 21.94 min., preceded by dimer (20.135 min.) and trimer
(18.640 min.). Gel filtration column, Protein-Pak 300SW (Waters Associates,
Milford, MA).

FIG. 9 is a similar chromatogram of purified dimer (20.14 min.) from 
Fig. 7, run on the gel filtration HPLC column of Fig. 8. 

FIG. 10A is an amino acid (SEQ ID NO. 11) and nucleotide (SEQ ID
NO. 10) sequence of the single-chain protein comprising the 4-4-20 VL region
connected through the 212 linker polypeptide to the CC49 VH region.

FIG. 10B is an amino acid (SEQ ID NO. 13) and nucleotide (SEQ ID
NO. 12) sequence of the single-chain protein comprising the CC49 VL region
connected through the 212 linker polypeptide to the 4-4-20 VH region.

FIG. 11 is a chromatogram depicting the separation of the monomer
(27.83 min.) and dimer (50.47 min.) forms of the CC49/212 antigen-binding
protein by cation exchange, on a PolyCAT A cation exchange column (Poly
LC, Columbia, MD).

Fig. 12 shows the separation of monomer (17.65 min.), dimer (15.79
min.), trimer (14.19 min.), and higher oligomers (shoulder at about 13.09
min.) of the B6.2/212 antigen-binding protein. This separation depicts the
results of a 24-hour treatment of a 1.0 mg/ml B6.2/212 single-chain
antigen-binding protein sample. A TSK G2000SW gel filtration HPLC column
(Toyo Soda, Tokyo, Japan) was used.

Fig. 13 shows the results of a 24-hour treatment of a 4.0 mg/ml
CC49/212 antigen-binding protein sample, generating monomer, dimer, and
trimer at 16.91, 14.9, and 13.42 min., respectively. The same TSK gel
filtration column was used as in Fig. 12.

Fig. 14 shows a schematic view of the four-chain structure of a human
IgG molecule.

Fig. 15A is an amino acid (SEQ ID NO. 15) and nucleotide (SEQ ID
NO. 14) sequence of the 4-4-20/212 single-chain antigen-binding protein with
a single cysteine hinge.

Fig. 15B is an amino acid (SEQ ID NO. 17) and nucleotide (SEQ ID
NO. 16) sequence of the 4-4-20/212 single-chain antigen-binding protein with 
the two-cysteine hinge. 

Fig. 16 shows the amino acid (SEQ ID NO. 19) and nucleotide (SEQ
ID NO. 18) sequence of a divalent CC49/212 single-chain antigen-binding
protein.

Fig. 17 shows the expression of the divalent CC49/212 single-chain
antigen-binding protein of Fig. 16 at 42°C, on an SDS-PAGE gel containing
total E. coli protein. Lane 1 contains the molecular weight standards. Lane
2 is the uninduced E. coli production strain grown at 30°C. Lane 3 is divalent
CC49/212 single-chain antigen-binding protein induced by growth at 42°C.
The arrow shows the band of expressed divalent CC49/212 single-chain
antigen-binding protein.

Fig. 18 is a graphical representation of four competition
radioimmunoassays (RIA) in which unlabeled CC49 IgG (open circles),
CC49/212 single-chain antigen-binding protein (closed circles), CC49/212
divalent antigen-binding protein (closed squares), and anti-fluorescein
4-4-20/212 single-chain antigen-binding protein (open squares) competed against
a CC49 IgG radiolabeled with 125I for binding to the TAG-72 antigen on a
human breast carcinoma extract.

Figure 19A is an amino acid (SEQ ID NO. 21) and nucleotide (SEQ
ID NO. 20) sequence of the single-chain polypeptide comprising the 4-4-20 VL
region connected through the 217 linker polypeptide to the CC49 VH region.

Figure 19B is an amino acid (SEQ ID NO. 23) and nucleotide (SEQ 
ID NO. 22) sequence of the single-chain polypeptide comprising the CC49 VL
region connected through the 217 linker polypeptide to the 4-4-20 VH region.

Figure 20 is a chromatogram depicting the purification of CC49/4-4-20 
heterodimer Fv on a cation exchange high performance liquid chromatographic 
column. The column is a PolyCAT A aspartic acid column (Poly LC,
Columbia, MD). The heterodimer Fv is shown as peak 5, eluting at 30.10
min.

Figure 21 is a coomassie-blue stained 4-20% SDS-PAGE gel showing
the proteins separated in Figure 20. Lane 1 contains the molecular weight
standards. Lane 3 contains the starting material before separation. Lanes 4-8
contain fractions 2, 3, 5, 6 and 7 respectively. Lane 9 contains purified
CC49/212.

Figure 22A is a chromatogram used to determine the molecular size of
fraction 2 from Figure 20. A TSK G3000SW gel filtration HPLC column was
used (Toyo Soda, Tokyo, Japan).

Figure 22B is a chromatogram used to determine the molecular size of
fraction 5 from Figure 20. A TSK G3000SW gel filtration HPLC column was
used (Toyo Soda, Tokyo, Japan).

Figure 22C is a chromatogram used to determine the molecular size of
fraction 6 from Figure 20. A TSK G3000SW gel filtration HPLC column was
used (Toyo Soda, Tokyo, Japan).

Figure 23 shows a Scatchard analysis of the fluorescein binding affinity 
of the CC49/4-4-20 heterodimer Fv (fraction 5 in Figure 20).

Figure 24 is a graphical representation of three competition
enzyme-linked immunosorbent assays (ELISA) in which unlabeled CC49/4-4-20 Fv
(closed squares), CC49/212 single-chain Fv (open squares), and MOPC-21 IgG
(+) competed against a biotin-labeled CC49 IgG for binding to the TAG-72
antigen on a human breast carcinoma extract. MOPC-21 is a control antibody
that does not bind to TAG-72 antigen.

Figure 25 shows a coomassie-blue stained non-reducing 4-20% SDS- 
PAGE gel. Lanes 1 and 9 contain the molecular weight standards. Lane 3 
contains the 4-4-20/212 CPPC single-chain antigen-binding protein after 
purification. Lanes 4, 5 and 6 contain the 4-4-20/212 CPPC single-chain
antigen-binding protein after treatment with DTT and air oxidation. Lane 7 
contains 4-4-20/212 single-chain antigen-binding protein. 

Figure 26 shows a coomassie-blue stained reducing 4-20% SDS-PAGE
gel (samples were treated with β-mercaptoethanol prior to being loaded on the
gel). Lanes 1 and 8 contain the molecular weight standards. Lane 3 contains
the 4-4-20/212 CPPC single-chain antigen-binding protein after treatment with
bis-maleimidehexane. Lane 5 contains peak 1 of bis-maleimidehexane treated
4-4-20/212 CPPC single-chain antigen-binding protein. Lane 6 contains peak
3 of bis-maleimidehexane treated 4-4-20/212 CPPC single-chain
antigen-binding protein.



Detailed Description of the Preferred Embodiments 



This invention relates to the discovery that multivalent forms of single- 
chain antigen-binding proteins have significant utility beyond that of the 
monovalent single-chain antigen-binding proteins. A multivalent antigen- 
binding protein has more than one antigen-binding site. For the purposes of 
this application, "valent" refers to the numerosity of antigen binding sites. 
Thus, a bivalent protein refers to a protein with two binding sites. Enhanced 
binding activity, bi- and multi-specific binding, and other novel uses of 
multivalent antigen-binding proteins have been demonstrated or are envisioned
here. Accordingly, the invention is directed to multivalent forms of
single-chain antigen-binding proteins, compositions of multivalent and single-chain
antigen-binding proteins, methods of making and purifying multivalent forms 
of single-chain antigen-binding proteins, and new and improved uses for 
multivalent forms of single-chain antigen-binding proteins. The invention 
provides a multivalent antigen-binding protein comprising two or more single- 
chain protein molecules, each single-chain molecule comprising a first 
polypeptide comprising the binding portion of the variable region of an
antibody heavy or light chain; a second polypeptide comprising the binding 
portion of the variable region of an antibody heavy or light chain; and a 
peptide linker linking the first and second polypeptides into a single-chain 
protein. 

The term "multivalent" means any assemblage, covalently or non- 
covalently joined, of two or more single-chain proteins, the assemblage having 
more than one antigen-binding site. The single-chain proteins composing the
assemblage may have antigen-binding activity, or they may lack antigen-binding
activity individually but be capable of assembly into active multivalent
antigen-binding proteins. The term "multivalent" encompasses bivalent, 
trivalent, tetravalent, etc. It is envisioned that multivalent forms above 
bivalent may be useful for certain applications. 

A preferred form of the multivalent antigen-binding protein comprises 
bivalent proteins, including heterobivalent and homobivalent forms. The term 
"bivalent" means an assemblage of single-chain proteins associated with each
other to form two antigen-binding sites. The term "heterobivalent" indicates
multivalent antigen-binding proteins that are bispecific molecules capable of
binding to two different antigenic determinants. Therefore, heterobivalent
proteins have two antigen-binding sites that have different binding specificities.
The term "homobivalent" indicates that the two binding sites are for the same
antigenic determinant.
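
The valence terminology defined above can be restated as a small
classification sketch. The sketch is illustrative only and forms no part of
the disclosure; the names BindingSite, VALENCE_NAMES, and describe are
hypothetical, and each binding site is reduced to a label for the antigenic
determinant it binds.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BindingSite:
    """One antigen-binding site, labelled by the determinant it binds (hypothetical model)."""
    determinant: str


# Names follow the text: "multivalent" encompasses bivalent, trivalent, tetravalent, etc.
VALENCE_NAMES = {1: "monovalent", 2: "bivalent", 3: "trivalent", 4: "tetravalent"}


def describe(sites: List[BindingSite]) -> str:
    """Name an assemblage by its number of sites and, if bivalent, by its specificities."""
    count = len(sites)
    if count == 0:
        return "no functional antigen-binding site"
    name = VALENCE_NAMES.get(count, f"{count}-valent")
    if count == 2:
        # Homobivalent: both sites bind the same determinant; heterobivalent: different ones.
        distinct = len({site.determinant for site in sites})
        return ("homo" if distinct == 1 else "hetero") + name
    return name


if __name__ == "__main__":
    print(describe([BindingSite("A"), BindingSite("A")]))
    print(describe([BindingSite("A"), BindingSite("B")]))
```

Under these assumptions, two sites for the same determinant are reported as
homobivalent and two sites for different determinants as heterobivalent.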

The terms "single-chain molecule" or "single-chain protein" are used 
interchangeably here. They are structurally defined as comprising the binding 
portion of a first polypeptide from the variable region of an antibody, 
associated with the binding portion of a second polypeptide from the variable 
region of an antibody, the two polypeptides being joined by a peptide linker 
linking the first and second polypeptides into a single polypeptide chain. The 
single polypeptide chain thus comprises a pair of variable regions connected 
by a polypeptide linker. The regions may associate to form a functional 
antigen-binding site, as in the case wherein the regions comprise a light-chain 
and a heavy-chain variable region pair with appropriately paired
complementarity determining regions (CDRs). In this case, the single-chain 
protein is referred to as a "single-chain antigen-binding protein" or "single- 
chain antigen-binding molecule." 

Alternatively, the variable regions may have unnaturally paired CDRs 
or may both be derived from the same kind of antibody chain, either heavy or 
light, in which case the resulting single-chain molecule may not display a
functional antigen-binding site. The single-chain antigen-binding protein
molecule is more fully described in U.S. Patent No. 4,946,778 (Ladner et al.),
which is incorporated herein by reference.

Without being bound by any particular theory, the inventors speculate
on several models which can equally explain the phenomenon of multivalence.
The inventors' models are presented herein for the purpose of illustration only,
and are not to be construed as limitations upon the scope of the invention.
The invention is useful and operable regardless of the precise mechanism of
multivalence.

Figure 1 depicts the first hypothetical model for the creation of a
multivalent protein, the "Association" model. Fig. 1A shows two monovalent
single-chain antigen-binding proteins, each composed of a VL, a VH, and a
linker polypeptide covalently bridging the two. Each monovalent single-chain
antigen-binding protein is depicted having an identical antigen-binding site
containing antigen. Figure 1B shows the simple association of the two
single-chain antigen-binding proteins to create the bivalent form of the
multivalent protein. It is hypothesized that simple hydrophobic forces between
the monovalent proteins are responsible for their association in this manner.
The origin of the multivalent proteins may be traceable to their concentration
dependence. The monovalent units retain their original association between
the VH and VL regions. Figure 1C shows the newly-formed homobivalent
protein binding two identical antigen molecules simultaneously. Homobivalent
antigen-binding proteins are necessarily monospecific for antigen.

Homovalent proteins formed according to the Association model are
depicted in Figs. 2A through 2C. Fig. 2A depicts a homobivalent protein,
Fig. 2B a trivalent protein, and Fig. 2C a tetravalent protein. Of course, the
limitations of two-dimensional images of three-dimensional objects must be
taken into account. Thus, the actual spatial arrangement of multivalent
proteins can be expected to vary somewhat from these figures.

A heterobivalent antigen-binding protein has two different binding sites,
the sites having different binding specificities. Figures 3A through C depict
the Association model pathway to the creation of a heterobivalent protein.
Figure 3A shows two monovalent single-chain antigen-binding proteins, Anti-A
single-chain antigen-binding protein and Anti-B single-chain antigen-binding
protein, with antigen types A and B occupying the respective binding sites.
Figure 3B depicts the heterobivalent protein formed by the simple association
of the original monovalent proteins. Figure 3C shows the heterobivalent
protein having bound antigens A and B in the antigen-binding sites. Figure
3C therefore shows the heterobivalent protein binding in a bispecific manner.
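
The concentration dependence hypothesized for the Association model can be
illustrated numerically. The sketch below assumes, purely for illustration,
a simple reversible monomer-dimer equilibrium 2M <=> D with dissociation
constant Kd = [M]^2/[D]; the specification states no such constant, and the
function name dimer_fraction, its parameters, and the values used are
hypothetical.

```python
import math


def dimer_fraction(total_conc: float, kd: float) -> float:
    """Fraction of single-chain molecules present as dimer at equilibrium.

    Assumes 2M <=> D with Kd = [M]^2 / [D] and mass balance
    total_conc = [M] + 2[D], all in the same (arbitrary) concentration units.
    """
    if total_conc <= 0 or kd <= 0:
        raise ValueError("total_conc and kd must be positive")
    # Substituting [D] = [M]^2 / Kd into the mass balance gives
    # 2[M]^2 / Kd + [M] - total_conc = 0; take the positive root.
    monomer = kd * (math.sqrt(1.0 + 8.0 * total_conc / kd) - 1.0) / 4.0
    return 1.0 - monomer / total_conc


if __name__ == "__main__":
    for conc in (0.1, 1.0, 10.0):
        print(conc, round(dimer_fraction(conc, kd=1.0), 3))
```

Under this assumed equilibrium the dimer fraction rises monotonically with
total protein concentration, which is consistent with a concentration-driven
association of the kind depicted in Fig. 1B; it does not establish that the
actual mechanism is a simple equilibrium.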

An alternative model for the formation of multivalent antigen-binding
proteins is shown in Figures 4 through 6. This "Rearrangement" model
hypothesizes the dissociation of the variable region interface by contact with
dissociating agents such as guanidine hydrochloride, urea, or alcohols such as
ethanol, either alone or in combination. Combinations and relevant
concentration ranges of dissociating agents are recited in the discussion
concerning dissociating agents, and in Example 2. Subsequent re-association
of dissociated regions allows variable region recombination differing from the
starting single-chain proteins, as depicted in Fig. 4B. The homobivalent
antigen-binding protein of Figure 4B is formed from the parent single-chain
antigen-binding proteins shown in Figure 4A, the recombined bivalent protein
having VL and VH from the parent monovalent single-chain proteins. The
homobivalent protein of Figure 4B is a fully functional monospecific bivalent
protein, shown actively binding two antigen molecules.

Figures 5A-5C show the formation of heterobivalent antigen-binding
proteins via the Rearrangement model. Figure 5A shows a pair of single-chain
proteins, each having a VL with complementarity determining regions
(CDRs) that do not match those of the associated VH. These single-chain
proteins have reduced or no ability to bind antigen because of the mixed
nature of their antigen-binding sites, and thus are made specifically to be
assembled into multivalent proteins through this route. Figure 5B shows the
heterobivalent antigen-binding protein formed whereby the VH and VL regions
of the parent proteins are shared between the separate halves of the
heterobivalent protein. Figure 5C shows the binding of two different antigen
molecules to the resultant functional bispecific heterobivalent protein. The
Rearrangement model also explains the generation of multivalent proteins of
a higher order than bivalent, as it can be appreciated that more than a pair of
single-chain proteins can be reassembled in this manner. These are depicted
in Figures 6A and 6B.
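
The domain pairing hypothesized by the Rearrangement model can be sketched
as follows. This is an illustrative bookkeeping model only, not a
description of the inventors' method; the names SingleChain and
rearranged_sites are hypothetical, and each variable region is reduced to a
label for the antigen its CDRs are directed against.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SingleChain:
    """A single-chain molecule modelled as a linked VL/VH pair (hypothetical model)."""
    vl: str  # antigen specificity of the VL region, e.g. "A"
    vh: str  # antigen specificity of the VH region, e.g. "B"

    def is_self_functional(self) -> bool:
        # A chain binds antigen on its own only if its VL and VH are matched.
        return self.vl == self.vh


def rearranged_sites(first: SingleChain, second: SingleChain) -> List[str]:
    """Pair each chain's VL with the other chain's VH; list the functional sites formed."""
    cross_pairs = [(first.vl, second.vh), (second.vl, first.vh)]
    return [vl for vl, vh in cross_pairs if vl == vh]


if __name__ == "__main__":
    # Fig. 5A arrangement: each chain carries mismatched variable regions.
    chain_1 = SingleChain(vl="B", vh="A")
    chain_2 = SingleChain(vl="A", vh="B")
    print(chain_1.is_self_functional(), chain_2.is_self_functional())
    print(sorted(rearranged_sites(chain_1, chain_2)))
```

In this model the two mixed chains of Figure 5A lack a functional site
individually but yield one anti-A site and one anti-B site after
cross-pairing, while two identical matched chains, as in Figure 4A, yield
two sites of the same specificity.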

One of the major utilities of the multivalent antigen-binding protein is 
in the heterobivalent form, in which one specificity is for one type of hapten 
or antigen, and the second specificity is for a second type of hapten or 
antigen. A multivalent molecule having two distinct binding specificities has 
many potential uses. For instance, one antigen binding site may be specific 
for a cell-surface epitope of a target cell, such as a tumor cell or other 
undesirable cell. The other antigen-binding site may be specific for a
cell-surface epitope of an effector cell, such as the CD3 protein of a cytotoxic
T-cell. In this way, the heterobivalent antigen-binding protein may guide a
cytotoxic cell to a particular class of cells that are to be preferentially
attacked. 

Other uses of heterobivalent antigen-binding proteins are the specific 
targeting and destruction of blood clots by a bispecific molecule with 
specificity for tissue plasminogen activator (tPA) and fibrin; the specific 
targeting of pro-drug activating enzymes to tumor cells by a bispecific
molecule with specificity for tumor cells and enzyme; and specific targeting 
of cytotoxic proteins to tumor cells by a bispecific molecule with specificity 
for tumor cells and a cytotoxic protein. This list is illustrative only, and any 
use for which a multivalent specificity is appropriate comes within the scope 
of this invention. 

The invention also extends to uses for the multivalent antigen-binding 
proteins in purification and biosensors. Affinity purification is made possible 
by affixing the multivalent antigen-binding protein to a support, with the 
antigen-binding sites exposed to and in contact with the ligand molecule to be 
separated, and thus purified. Biosensors generate a detectable signal upon 
binding of a specific antigen to an antigen-binding molecule, with subsequent 
processing of the signal. Multivalent antigen-binding proteins, when used as 
the antigen-binding molecule in biosensors, may change conformation upon 
binding, thus generating a signal that may be detected.

Essentially all of the uses for which monoclonal or polyclonal
antibodies, or fragments thereof, have been envisioned by the prior art, can
be addressed by the multivalent proteins of the present invention. These uses
include detectably-labelled forms of the multivalent protein. Types of labels
are well-known to those of ordinary skill in the art. They include
radiolabelling, chemiluminescent labelling, fluorochromic labelling, and
chromophoric labelling. Other uses include imaging the internal structure of
an animal (including a human) by administering an effective amount of a
labelled form of the multivalent protein and measuring detectable radiation 
associated with the animal. They also include improved immunoassays, 
including sandwich immunoassay, competitive immunoassay, and other 
immunoassays wherein the labelled antibody can be replaced by the 
multivalent antigen-binding protein of this invention. 

A first preferred method of producing multivalent antigen-binding 
proteins involves separating the multivalent proteins from a production 
composition that comprises both multivalent and single-chain proteins, as 
represented in Example 1. The method comprises producing a composition 
of multivalent and single-chain proteins, separating the multivalent proteins 
from the single-chain proteins, and recovering the multivalent proteins. 

A second preferred method of producing multivalent antigen-binding 
proteins comprises the steps of producing single-chain protein molecules, 
dissociating said single-chain molecules, reassociating the single-chain 
molecules such that a significant fraction of the resulting composition includes 
multivalent forms of the single-chain antigen-binding proteins, separating 
multivalent antigen-binding proteins from single-chain molecules, and 
recovering the multivalent proteins. This process is illustrated with more 
detail in Example 2. For the purposes of this method, the term "producing a 
composition comprising single-chain molecules" may indicate the actual 
production of these molecules. The term may also include procuring them 
from whatever commercial or institutional source makes them available. Use 
of the term "producing single-chain proteins" means production of single-chain 
proteins by any process, but preferably according to the process set forth in 
U.S. Patent No. 4,946,778 (Ladner et al.). Briefly, that patent pertains to a 
single polypeptide chain antigen-binding molecule which has binding 
specificity and affinity substantially similar to the binding specificity and 
affinity of the aggregate light and heavy chain variable regions of an antibody, 
to genetic sequences coding therefor, and to recombinant DNA methods of 
producing such molecules, and uses for such molecules. The single-chain 
protein produced by the Ladner et al. methodology comprises two regions 
linked by a linker polypeptide. The two regions are termed the VH and VL 
regions, each region comprising one half of a functional antigen-binding site. 

The term "dissociating said single-chain molecules" means to cause the 
physical separation of the two variable regions of the single-chain protein 
without causing denaturation of the variable regions. 

"Dissociating agents" are defined herein to include all agents capable 
of dissociating the variable regions, as defined above. In the context of this 
invention, the term includes the well-known agents alcohol (including ethanol), 
guanidine hydrochloride (GuHCl), and urea. Others will be apparent to those 
of ordinary skill in the art, including detergents and similar agents capable of 
interrupting the interactions that maintain protein conformation. In the 
preferred embodiment, a combination of GuHCl and ethanol (EtOH) is used 
as the dissociating agent. A preferred range for ethanol and GuHCl is from 
0 to 50% EtOH, vol/vol, and 0 to 2.0 moles per liter (M) GuHCl. A more 
preferred range is from 10-30% EtOH and 0.5-1.0 M GuHCl, and the most 
preferred combination is 20% EtOH and 0.5 M GuHCl. A preferred dissociation 
buffer contains 0.5 M guanidine hydrochloride, 20% ethanol, 0.05 M TRIS, and 
0.01 M CaCl2, pH 8.0. 
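
The preferred dissociation buffer can be restated as a weighing list. The following is a minimal sketch under stated assumptions, not part of the disclosed method: it assumes Tris base, anhydrous calcium chloride and absolute ethanol, uses standard molar masses that do not appear in the text, and leaves the adjustment to pH 8.0 to the operator. The function name is hypothetical.

```python
# Standard molar masses in g/mol (assumed reagent forms, not stated in the text).
MOLAR_MASS = {
    "guanidine_hcl": 95.53,
    "tris_base": 121.14,
    "cacl2_anhydrous": 110.98,
}


def dissociation_buffer(volume_l: float) -> dict:
    """Reagent amounts for 0.5 M GuHCl, 20% EtOH, 0.05 M Tris, 0.01 M CaCl2."""
    return {
        "guanidine_hcl_g": 0.5 * MOLAR_MASS["guanidine_hcl"] * volume_l,
        "tris_base_g": 0.05 * MOLAR_MASS["tris_base"] * volume_l,
        "cacl2_anhydrous_g": 0.01 * MOLAR_MASS["cacl2_anhydrous"] * volume_l,
        # 20% vol/vol of the final volume, assuming absolute ethanol
        "ethanol_ml": 0.20 * volume_l * 1000.0,
    }


recipe = dissociation_buffer(1.0)
for reagent, amount in recipe.items():
    print(f"{reagent}: {amount:.2f}")
```

If a hydrated calcium chloride or Tris hydrochloride is used instead, the corresponding molar mass would need to be changed.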

Use of the term "re-associating said single-chain molecules" is meant 
to describe the reassociation of the variable regions by contacting them with 
a buffer solution that allows reassociation. Such a buffer is preferably used 
in the present invention and is characterized as being composed of 0.04 M 
MOPS, 0.10 M calcium acetate, pH 7.5. Other buffers allowing the 
reassociation of the VL and VH regions are well within the expertise of one of 
ordinary skill in the art. 

The separation of the multivalent protein from the single-chain 
molecules occurs by use of standard techniques known in the art, particularly 
including cation exchange or gel filtration chromatography. 

Cation exchange chromatography is the general liquid chromatographic 
technique of ion-exchange chromatography utilizing columns bearing anionic 
groups, well-known to those of ordinary skill in the art. In this invention, the 
cations exchanged are the single-chain and multivalent protein molecules. Since 
multivalent proteins will have some multiple of the net charge of the 
single-chain molecule, the multivalent proteins are retained more strongly and 
are thus separated from the single-chain molecules. The preferred cationic 
exchanger of the present invention is a polyaspartic acid column, as shown in 
Figure 7. Figure 7 depicts the separation of single-chain protein (Peak 1, 
27.32 min.) from bivalent protein (Peak 2, 55.54 min.). Those of ordinary skill 
in the art 
will realize that the invention is not limited to any particular type of 
chromatography column, so long as it is capable of separating the two forms 
of protein molecules. 

Gel filtration chromatography is the use of a gel-like material to 
separate proteins on the basis of their molecular weight. A "gel" is a matrix 
of water and a polymer, such as agarose or polymerized acrylamide. The 
present invention encompasses the use of gel filtration HPLC (high 
performance liquid chromatography), as will be appreciated by one of ordinary 
skill in the art. Figure 8 is a chromatogram depicting the use of a Waters 
Associates' Protein-Pak 300 SW gel filtration column to separate monovalent 
single-chain protein from multivalent protein, including the monomer (21.940 
min.), bivalent protein (20.135 min.), and trivalent protein (18.640 min.). 

Recovering the multivalent antigen-binding proteins is accomplished by 
standard collection procedures well known in the chemical and biochemical 
arts. In the context of the present invention recovering the multivalent protein 
preferably comprises collection of eluate fractions containing the peak of 
interest from either the cation exchange column, or the gel filtration HPLC 
column. Manual and automated fraction collection are well-known to one of 
ordinary skill in the art. Subsequent processing may involve lyophilization of 
the eluate to produce a stable solid, or further purification. 

A third preferred method of producing multivalent antigen-binding 
proteins is to start with purified single-chain proteins at a lower concentration, 
and then increase the concentration until some significant fraction of 
multivalent proteins is formed. The multivalent proteins are then separated 
and recovered. The concentrations conducive to formation of multivalent 
proteins in this manner are from about 0.5 milligram per milliliter (mg/ml) to 
the concentration at which precipitates begin to form. 

The use of the term "substantially free" when used to describe a 
composition of multivalent and single-chain antigen-binding protein molecules 
means the lack of a significant peak corresponding to the single-chain 
molecule, when the composition is analyzed by cation exchange 
chromatography, as disclosed in Example 1 or by gel filtration 
chromatography as disclosed in Example 2. 

By use of the term "aqueous composition" is meant any composition 
of single-chain molecules and multivalent proteins including a portion of 
water. In the same context, the phrase "an excess of multivalent antigen- 
binding protein over single-chain molecules" indicates that the composition 
comprises more than 50% of multivalent antigen-binding protein. 

The use of the term "cross-linking" refers to chemical means by which 
one can produce multivalent antigen-binding proteins from monovalent single- 
chain protein molecules. For example, the incorporation of a cross-linkable 
sulfhydryl chemical group as a cysteine residue in the single-chain proteins 
allows cross-linking by mild reduction of the sulfhydryl group. Both 
monospecific and multispecific multivalent proteins can be produced from 
single-chain proteins by cross-linking the free cysteine groups from two or 
more single-chain proteins, causing a covalent chemical linkage to form 
between the individual proteins. Free cysteines have been engineered into the 
C-terminal portion of the 4-4-20/212 single-chain antigen-binding protein, as 
discussed in Example 5 and Example 8. These free cysteines may then be 
cross-linked to form multivalent antigen-binding proteins. 

The invention also comprises single-chain proteins, comprising: (a) a 
first polypeptide comprising the binding portion of the variable region of an 
antibody light chain; (b) a second polypeptide comprising the binding portion 
of the variable region of an antibody light chain; and (c) a peptide linker 
linking said first and second polypeptides (a) and (b) into said single-chain 
protein. A similar single-chain protein comprising the heavy chain variable 
regions is also a part of this invention. Genetic sequences encoding these 
molecules are also included in the scope of this invention. Since these proteins 
are comprised of two similar variable regions, they do not necessarily have 
any antigen-binding capability. 

The invention also includes a DNA sequence encoding a bispecific 
bivalent antigen-binding protein. Example 4 and Example 7 discuss in detail 
the sequences that appear in Figs. 10A and 10B that allow one of ordinary 
skill to construct a heterobivalent antigen-binding molecule. Figure 10A is an 
amino acid and nucleotide sequence listing of the single-chain protein 
comprising the 4-4-20 VL region connected through the 212 linker polypeptide 
to the CC49 VH region. Figure 10B is a similar listing of the single-chain 
protein comprising the CC49 VL region connected through the 212 linker 
polypeptide to the 4-4-20 VH region. Subjecting a composition including these 
single-chain molecules to dissociating and subsequent re-associating conditions 
results in the production of a bivalent protein with two different binding 
specificities. 

Synthesis of DNA sequences is well known in the art, and possible 
through at least two routes. First, it is well-known that DNA sequences may 
be synthesized de novo through the use of automated DNA synthesizers, once 
the primary sequence information is known. Alternatively, it is possible to 
obtain a DNA sequence coding for a multivalent single-chain antigen-binding 
protein by removing the stop codons from the end of a gene encoding a single- 
chain antigen-binding protein, and then inserting a linker and a gene encoding 
a second single-chain antigen-binding protein. Example 6 demonstrates the 
construction of a DNA sequence coding for a bivalent single-chain antigen- 
binding protein. Other methods of genetically constructing multivalent single- 
chain antigen-binding proteins come within the spirit and scope of the present 
invention. 
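
The second route (removing the stop codons from a first gene, then appending a linker and a second gene) can be sketched at the sequence level as follows. This is an illustration only: the function names are hypothetical, the short sequences are placeholders and are not the sequences of Figures 10A and 10B, and real constructions rely on restriction sites and cloning steps that this sketch ignores.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop_codons(cds: str) -> str:
    """Remove trailing in-frame stop codons from a coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    while len(cds) >= 3 and cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    return cds


def fuse(first_cds: str, linker_cds: str, second_cds: str) -> str:
    """Join gene 1 (stops removed), a linker, and gene 2 in one reading frame."""
    if len(linker_cds) % 3 != 0:
        raise ValueError("linker must preserve the reading frame")
    return strip_stop_codons(first_cds) + linker_cds.upper() + second_cds.upper()


# Placeholder sequences, for illustration only.
gene1 = "ATGGCTAAATAATAG"   # ends with two stop codons
linker = "GGTTCT"
gene2 = "GATATCTAA"         # keeps its own stop codon
fused = fuse(gene1, linker, gene2)
print(fused)
```

Keeping the linker length a multiple of three is what keeps the second gene in the reading frame of the first.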

Having now generally described this invention the same will better be 
understood by reference to certain specific examples which are included for 
purposes of illustration and are not intended to limit it unless otherwise 
specified. 

Example 1 

Production of Multivalent 
Antigen-Binding Proteins During Purification 

In the production of multivalent antigen-binding proteins, the same 
recombinant E. coli production system that was used for prior single-chain 
antigen-binding protein production was used. See Bird, et al., Science 242:423 
(1988). This production system produced between 2 and 20% of the total E. 
coli protein as antigen-binding protein. For protein recovery, the frozen cell 
paste from three 10-liter fermentations (600-900 g) was thawed overnight at 
4°C and gently resuspended at 4°C in 50 mM Tris-HCl, 1.0 mM EDTA, 100 
mM KCl, 0.1 mM PMSF, pH 8.0 (lysis buffer), using 10 liters of lysis buffer 
for every kilogram of wet cell paste. When thoroughly resuspended, the 
chilled mixture was passed three times through a Manton-Gaulin cell 
homogenizer to totally lyse the cells. Because the cell homogenizer raised the 
temperature of the cell lysate to 25 ± 5°C, the cell lysate was cooled to 
5 ± 2°C with a Lauda/Brinkman chilling coil after each pass. Complete lysis 
was verified by visual inspection under a microscope. 

The cell lysate was centrifuged at 24,300g for 30 min. at 6°C using a 
Sorvall RC-5B centrifuge. The pellet containing the insoluble antigen-binding 
protein was retained, and the supernatant was discarded. The pellet was 
washed by gently scraping it from the centrifuge bottles and resuspending it 
in 5 liters of lysis buffer/kg of wet cell paste. The resulting 3.0- to 4.5-liter 
suspension was again centrifuged at 24,300g for 30 min at 6°C, and the 
supernatant was discarded. This washing of the pellet removes soluble E. coli 
proteins and can be repeated as many as five times. At any time during this 
washing procedure the material can be stored as a frozen pellet at -20°C. A 
substantial time saving in the washing steps can be accomplished by utilizing 
a Pellicon tangential flow apparatus equipped with 0.22-µm microporous 
filters, in place of centrifugation. 
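
The buffer volumes in the lysis and washing steps scale with the wet weight of the cell paste. As a small sketch (the function and constant names are hypothetical), the stated ratios of 10 liters and 5 liters per kilogram can be applied to the 600-900 g range given above; the wash volumes obtained agree with the 3.0- to 4.5-liter suspension mentioned in the text.

```python
LYSIS_L_PER_KG = 10.0  # liters of lysis buffer per kg of wet cell paste
WASH_L_PER_KG = 5.0    # liters of lysis buffer per kg for each pellet wash


def buffer_volume_l(wet_paste_g: float, liters_per_kg: float) -> float:
    """Buffer volume in liters for a given wet cell paste mass in grams."""
    return wet_paste_g / 1000.0 * liters_per_kg


volumes = {
    paste_g: (
        buffer_volume_l(paste_g, LYSIS_L_PER_KG),
        buffer_volume_l(paste_g, WASH_L_PER_KG),
    )
    for paste_g in (600.0, 900.0)
}
for paste_g, (lysis_l, wash_l) in volumes.items():
    print(f"{paste_g:.0f} g paste: lysis {lysis_l:.1f} L, wash {wash_l:.1f} L")
```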

The washed pellet was solubilized at 4°C in freshly prepared 6 M 
guanidine hydrochloride, 50 mM Tris-HCl, 10 mM CaCl2, 50 mM KCl, pH 
8.0 (dissociating buffer), using 9 ml/g of pellet. If necessary, a few quick 
pulses from a Heat Systems Ultrasonics tissue homogenizer can be used to 
complete the solubilization. The resulting suspension was centrifuged at 
24,300g for 45 min at 6°C and the pellet was discarded. The optical density 
of the supernatant was determined at 280 nm and if the OD280 was above 30, 
additional dissociating buffer was added to obtain an OD280 of approximately 
25. 
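
The adjustment of the extract to an OD280 of approximately 25 is a simple dilution. The sketch below assumes that optical density is proportional to concentration and that volumes are additive; neither assumption is stated in the text, and the function name is hypothetical.

```python
def buffer_to_add_ml(volume_ml: float, od_measured: float,
                     od_target: float = 25.0, od_limit: float = 30.0) -> float:
    """Dissociating buffer (ml) needed to bring the OD280 down to od_target.

    No buffer is added when the measured OD280 is at or below od_limit.
    """
    if od_measured <= od_limit:
        return 0.0
    return volume_ml * (od_measured / od_target - 1.0)


# Hypothetical example: 1000 ml of extract measured at OD280 = 40.
extra_ml = buffer_to_add_ml(1000.0, 40.0)
print(f"add about {extra_ml:.0f} ml of dissociating buffer")
```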

The supernatant was slowly diluted into cold (4-7°C) refolding buffer 
(50 mM Tris-HCl, 10 mM CaCl2, 50 mM KCl, pH 8.0) until a 1:10 dilution 
was reached (final volume 10-20 liters). Re-folding occurs over approximately 
eighteen hours under these conditions. The best results are obtained when the 
GuHCl extract is slowly added to the refolding buffer over a 2-h period, with 
gentle mixing. The solution was left undisturbed for at least a 20-h period, 
and 95% ethanol was added to this solution such that the final ethanol 
concentration was approximately 20%. This solution was left undisturbed until 
the flocculated material settled to the bottom, usually not less than sixty 
minutes. The solution was filtered through a 0.2 µm Millipore Millipak 200. 
This filtration step may be optionally preceded by a centrifugation step. The 
filtrate was concentrated to 1 to 2 liters using an Amicon spiral cartridge with 
a 10,000 MWCO cartridge, again at 4°C. 
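
The volume of 95% ethanol needed to reach a final concentration of approximately 20% follows from a mass balance on ethanol. The sketch below assumes vol/vol percentages and additive volumes, which is only approximately true for ethanol-water mixtures; the function name is hypothetical.

```python
def ethanol_stock_to_add_l(solution_l: float, stock_fraction: float = 0.95,
                           final_fraction: float = 0.20) -> float:
    """Liters of ethanol stock to add for the desired final vol/vol fraction."""
    if not 0.0 < final_fraction < stock_fraction:
        raise ValueError("final fraction must lie between 0 and the stock fraction")
    # stock * x = final * (solution + x)  =>  x = solution * final / (stock - final)
    return solution_l * final_fraction / (stock_fraction - final_fraction)


# The refolding step gives a final volume of 10-20 liters.
for refold_l in (10.0, 20.0):
    print(f"{refold_l:.0f} L refold: add {ethanol_stock_to_add_l(refold_l):.2f} L of 95% ethanol")
```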

The concentrated crude antigen-binding protein sample was dialyzed 
against Buffer A (60 mM MOPS, 0.5 mM Ca acetate, pH 6.0-6.4) until the 
conductivity was lowered to that of Buffer A. The sample was then loaded on 
a 21.5 x 250-mm polyaspartic acid PolyCAT A column, manufactured by 
PolyLC of Columbia, Maryland. If more than 60 mg of protein is loaded on this 
column, the resolution begins to deteriorate; thus, the concentrated crude 
sample often must be divided into several PolyCAT A runs. Most antigen- 
binding proteins have an extinction coefficient of about 2.0 ml mg^-1 cm^-1 at 
280 nm and this can be used to determine protein concentration. The antigen- 
binding protein sample was eluted from the PolyCAT A column with a 50-min 
linear gradient from Buffer A to Buffer B (see Table 1). Most of the single- 
chain proteins elute between 20 and 26 minutes when this gradient is used. 
This corresponds to an eluting solvent composition of approximately 70% 
Buffer A and 30% Buffer B. Most of the bivalent antigen-binding proteins 
elute later than 45 minutes, which corresponds to over 90% Buffer B. 
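
The extinction coefficient and the 60 mg loading limit given above can be combined to plan the PolyCAT A runs. The following sketch assumes a 1 cm path length and a sample diluted into the linear range of the spectrophotometer; the function names and the example figures are hypothetical.

```python
import math

EXTINCTION_ML_PER_MG_CM = 2.0  # approximate value at 280 nm, from the text
MAX_LOAD_MG = 60.0             # load above which resolution begins to deteriorate


def protein_mg_per_ml(a280: float, path_cm: float = 1.0) -> float:
    """Protein concentration from absorbance at 280 nm (Beer-Lambert law)."""
    return a280 / (EXTINCTION_ML_PER_MG_CM * path_cm)


def runs_needed(total_protein_mg: float) -> int:
    """Number of column runs so that no run exceeds MAX_LOAD_MG."""
    return math.ceil(total_protein_mg / MAX_LOAD_MG)


# Hypothetical sample: A280 of 0.5 after a 1:10 dilution, 100 ml of concentrate.
conc_mg_ml = protein_mg_per_ml(0.5) * 10.0
total_mg = conc_mg_ml * 100.0
print(f"{conc_mg_ml:.2f} mg/ml, {total_mg:.0f} mg total, {runs_needed(total_mg)} runs")
```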

Figure 7 is a chromatogram depicting the separation of single-chain 
protein from bivalent CC49/212 protein, using the cation-exchange method just 
described. Peak 1, 27.32 minutes, represents the monomeric single-chain 
fraction. Peak 2, 55.52 minutes, represents the bivalent protein fraction. 

Figure 8 is a chromatogram of the purified monomeric single-chain 
antigen-binding protein CC49/212 (Fraction 7 from Fig. 7) run on a Waters 
Protein-Pak 300SW gel filtration column. Monomer, with minor contaminants 
of dimer and trimer, is shown. Figure 9 is a chromatogram of the purified 
bivalent antigen-binding protein CC49/212 (Fraction 15 from Fig. 7) run on 
the same Waters Protein-Pak 300SW gel filtration column as used in Fig. 8. 

cation exchange and gel filtration chromatography, can be used to separate the 
single-chain protein monomer from the multivalent antigen-binding proteins. 
In the first method, monomeric and multivalent antigen-binding proteins were 
separated by using cation exchange HPLC chromatography, using a 
polyaspartate column (PolyCAT A). This was a similar procedure to that used 
in the final purification of the antigen-binding proteins as described in 
Example 1. The load buffer was 0.06 M MOPS, 0.001 M Calcium Acetate 
pH 6.4. In the second method, the monomeric and multivalent antigen- 
binding proteins were separated by gel filtration HPLC chromatography using 
as a load buffer 0.04 M MOPS, 0.10 M Calcium Acetate pH 7.5. Gel 
filtration chromatography separates proteins based on their molecular size. 

Once the antigen-binding protein sample was loaded on the cation 
exchange HPLC column, a linear gradient was run between the load buffer 
(0.04 to 0.06 M MOPS, 0.000 to 0.001 M calcium acetate, 0 to 10% glycerol 
pH 6.0-6.4) and a second buffer (0.04 to 0.06 M MOPS, 0.01 to 0.02 M 
calcium acetate, 0 to 10% glycerol pH 7.5). It was important to dialyze 
the antigen-binding protein sample extensively before loading it on the 
column. Normally, the conductivity of the sample is monitored against the 
dialysis buffer. Dialysis is continued until the conductivity drops below 600 
µS. Figure 11 shows the separation of the monomeric (27.83 min.) and 
bivalent (50.47 min.) forms of the CC49/212 antigen-binding protein by cation 
exchange. The chromatographic conditions for this separation were as 
follows: PolyCAT A column, 200 x 4.6 mm, operated at 0.62 ml/min.; load 
buffer and second buffer as in Example 1; gradient program from 100 percent 
load buffer A to 0 percent load buffer A over 48 min.; sample was CC49/212, 
1.66 mg/ml; injection volume 0.2 ml. Fractions were collected from the two 
peaks from a similar chromatogram and identified as monomeric and bivalent 
proteins using gel filtration HPLC chromatography as described below. 
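
The chromatographic conditions stated above fix the amount of protein injected and, at constant flow, the volume at which each peak elutes. A short sketch of that arithmetic follows; the function names are hypothetical and the elution volumes ignore any system dead volume.

```python
FLOW_ML_PER_MIN = 0.62      # column flow rate from the text
SAMPLE_MG_PER_ML = 1.66     # CC49/212 sample concentration
INJECTION_ML = 0.2          # injection volume


def injected_mass_mg(conc_mg_per_ml: float, volume_ml: float) -> float:
    """Protein mass placed on the column."""
    return conc_mg_per_ml * volume_ml


def elution_volume_ml(retention_min: float, flow: float = FLOW_ML_PER_MIN) -> float:
    """Volume of eluent delivered up to the given retention time."""
    return retention_min * flow


load_mg = injected_mass_mg(SAMPLE_MG_PER_ML, INJECTION_ML)
monomer_ml = elution_volume_ml(27.83)   # monomeric peak of Figure 11
bivalent_ml = elution_volume_ml(50.47)  # bivalent peak of Figure 11
print(f"load {load_mg:.3f} mg; monomer at {monomer_ml:.2f} ml; bivalent at {bivalent_ml:.2f} ml")
```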

Gel filtration HPLC chromatography (TSK G2000SW column from 
Toyo Soda, Tokyo, Japan) was used to identify and separate monomeric 
single-chain and multivalent antigen-binding proteins. This procedure has 
been described by Fukano, et al., J. Chromatography 166:47 (1978). 
Multimerization (creation of multivalent protein from monomeric single-chain 
protein) was by treatment with 0.5 M GuHCl and 20% EtOH for the times 
indicated in Table 2A followed by dialysis into the chromatography buffer. 
Figure 12 shows the separation of monomeric (17.65 min.), bivalent (15.79 
min.), trivalent (14.19 min.), and higher oligomers (shoulder at about 13.09 
min.) of the B6.2/212 antigen-binding protein. The B6.2/212 single-chain 
antigen-binding protein is described in Colcher, D., et al., J. Nat. Cancer 
Inst. 82:1191-1197 (1990). This separation depicts the results of a 24-hour 
multimerization treatment of a 1.0 mg/ml B6.2/212 antigen-binding protein 
sample. The HPLC buffer used was 0.04 M MOPS, 0.10 M calcium acetate, 
0.04% sodium azide, pH 7.5. 

Figure 13 shows the results of a 24-hour treatment of a 4.0 mg/ml 
CC49/212 antigen-binding protein sample, generating monomeric, bivalent and 
trivalent proteins at 16.91, 14.9, and 13.42 min., respectively. The HPLC 
buffer was 40 mM MOPS, 100 mM calcium acetate, pH 7.35. 
Multimerization treatment was for the times indicated in Table 2. 

The results of Example 2A are shown in Table 2A. Table 2A shows 
the percentage of bivalent and other multivalent forms before and after 
treatment with 20% ethanol and 0.5 M GuHCl. Unless otherwise indicated, 
percentages were determined using an automatic data integration software 
package. 

Table 2A 

Summary of the generation of bivalent and higher 
multivalent forms of B6.2/212 and CC49/212 
proteins using guanidine hydrochloride and ethanol 

            Time      Concentration                       % 
protein     (hours)   (mg/ml)         monomer   dimer   trimer   multimers 

CC49/212    0         0.25            86.7      11.6    1.7      0.0 
            0                         84.0      10.6    5.5      0.0 
            0         4.0             70.0      17.1    12.9'    0.0 
            2         0.25'           62.9      33.2    4.2      0.0 
            2         1.0             24.2      70.6    5.1      0.0 
            2         4.0             9.3       81.3    9.5      0.0 
            26        0.25            16.0      77.6    6.4      0.0 
            26        1.0             9.2       82.8    7.9      0.0 
            26        4.0             3.7       78.2    18.1     0.0 

B6.2/212    0         0.25            100.0     0.0     0.0      0.0 
            0         1.0             100.0     0.0     0.0      0.0 
            0         4.0             100.0     0.0     0.0      0.0 
            2         0.25*           98.1      1.9     0.0      0.0 
            2         1.0             100.0     0.0     0.0      0.0 
            2         4.0             90.0      5.5     1.0      0.0 
            24        0.25            45.6      37.5    10.2     6.7 
            24        1.0             50.8      21.4    12.3     15.0 
            24        4.0             5.9       37.2    25.7     29.9 

• Based on cut out peaks that were weighted. 
^ Average of two experiments. 
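
The rows of Table 2A can be checked against the definition of "an excess of multivalent antigen-binding protein" given earlier (more than 50% multivalent protein). The sketch below uses only the three CC49/212 rows at 2 hours; the function names are hypothetical, and the multivalent percentage is taken as the sum of the dimer, trimer and multimer columns.

```python
# concentration (mg/ml) -> (monomer, dimer, trimer, multimers) in percent,
# CC49/212 after 2 hours in 0.5 M GuHCl and 20% ethanol (Table 2A).
CC49_2H = {
    0.25: (62.9, 33.2, 4.2, 0.0),
    1.0: (24.2, 70.6, 5.1, 0.0),
    4.0: (9.3, 81.3, 9.5, 0.0),
}


def multivalent_percent(row: tuple) -> float:
    """Sum of all non-monomer columns."""
    _monomer, *multivalent = row
    return sum(multivalent)


def has_excess_multivalent(row: tuple) -> bool:
    """True when multivalent forms make up more than 50% of the protein."""
    return multivalent_percent(row) > 50.0


for conc, row in CC49_2H.items():
    print(f"{conc} mg/ml: {multivalent_percent(row):.1f}% multivalent, "
          f"excess={has_excess_multivalent(row)}")
```

On these rows the multivalent fraction rises with protein concentration, which is the trend the third preferred method relies on.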



B. Process Using Urea and Ethanol 

Multivalent antigen-binding proteins were produced from purified 
single-chain proteins in the following way. First the purified single-chain 
protein at a concentration of 0.25-1 mg/ml was dialyzed against 2M urea, 20% 
ethanol (EtOH), and 50 mM Tris buffer pH 8.0, for the times indicated in 
Table 2B. This combination of dissociating agents is thought to disrupt the 
VL/VH interface, allowing the VH of a first single-chain molecule to come into 
contact with a VL from a second single-chain molecule. Other dissociating 
agents such as isopropanol or methanol should be substitutable for EtOH. 
Following the initial dialysis, the protein was dialyzed against the load buffer 
for the final HPLC purification step. 

Gel filtration HPLC chromatography (TSK G2000SW column from 
Toyo Soda, Tokyo, Japan) was used to identify and separate monomeric 
single-chain and multivalent antigen-binding proteins. This procedure has 
been described by Fukano, et al., J. Chromatography 166:47 (1978). 

The results of Example 2B are shown in Table 2B. Table 2B shows 
the percentage of bivalent and other multivalent forms before and after 
treatment with 20% ethanol and urea. Percentages were determined using an 
automatic data integration software package. 

Table 2B 

Summary of the generation of bivalent and higher 
multivalent forms of 
B6.2/212 and CC49/212 proteins using urea and ethanol 

            Time      Concentration                       % 
protein     (hours)   (mg/ml)         monomer   dimer   trimer   multimers 

B6.2        0         0.25            44.1      37.6    15.9     2.4 
            0         1.0             37.7      33.7    19.4     9.4 
            3         0.25            22.2      66.5    11.3     0.0 
            3         1.0             13.7      69.9    16.4     0.0 



Example 3 

Determination of Binding Constants 

Three anti-fluorescein single-chain antigen-binding proteins have been 
constructed based on the anti-fluorescein monoclonal antibody 4-4-20. The 
three 4-4-20 single-chain antigen-binding proteins differ in the polypeptide 
linker connecting the VH and VL regions of the protein. The three linkers used 
were 202', 212 and 216 (see Table 3). Bivalent and higher forms of the 
4-4-20 antigen-binding protein were produced by concentrating the purified 
monomeric single-chain antigen-binding protein in the cation exchange load 
buffer (0.06 M MOPS, 0.001 M calcium acetate pH 6.4) to 5 mg/ml. The 
bivalent and monomeric forms of the 4-4-20 antigen-binding proteins were 
separated by cation exchange HPLC (polyaspartate column) using a 50 min. 
linear gradient between the load buffer (0.06 M MOPS, 0.001 M calcium 
acetate pH 6.4) and a second buffer (0.06 M MOPS, 0.02 M calcium acetate 
pH 7.5). Two 0.02 ml samples were separated, and fractions of the bivalent 
and monomeric protein peaks were collected on each run. The amount of 
protein contained in each fraction was determined from the absorbance at 278 
nm from the first separation. Before collecting the fractions from the second 
separation run, each fraction tube had a sufficient quantity of 1.03 x 10^ M 
fluorescein added to it, such that after the fractions were collected a 1-to-1 
molar ratio of protein-to-fluorescein existed. Addition of fluorescein stabilized 
the bivalent form of the 4-4-20 antigen-binding proteins. These samples were 
kept at 2°C (on ice). 
The fluorescein dissociation rates were determined for each of these 
samples following the procedures described by Herron, J.N., in Fluorescence 
Hapten: An Immunological Probe, E.W. Voss, Ed., CRC Press, Boca Raton, 
FL (1984). A sample was first diluted with 20 mM HEPES buffer pH 8.0 to 
5.0 x 10* M 4-4-20 antigen-binding protein. 560 µl of the 5.0 x 10"* M 
4-4-20 antigen-binding protein sample was added to a cuvette in a fluorescence 
spectrophotometer equilibrated at 2°C and the fluorescence was read. 140 µl 
of 1.02 x 10"^ M fluoresceinamine was added to the cuvette, and the 
fluorescence was read every 1 minute for up to 25 minutes (see Table 4). 
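
Two pieces of arithmetic sit behind this measurement: the dilution caused by mixing 560 µl of protein with 140 µl of fluoresceinamine, and the extraction of a dissociation rate from readings taken once a minute. The sketch below is not the cited Herron procedure; it assumes a single first-order process in which the signal relaxes exponentially toward a known plateau, and it is exercised on synthetic readings generated with a rate of 4.9 x 10^-3 s^-1 chosen only for illustration. All function names are hypothetical.

```python
import math


def mixed_concentration(stock_conc: float, stock_ul: float, added_ul: float) -> float:
    """Concentration of the stock component after adding a second volume."""
    return stock_conc * stock_ul / (stock_ul + added_ul)


def first_order_rate(times_s: list, signal: list, plateau: float) -> float:
    """Rate constant (s^-1) from a least-squares line through ln|plateau - signal|."""
    ys = [math.log(abs(plateau - s)) for s in signal]
    n = len(times_s)
    t_mean = sum(times_s) / n
    y_mean = sum(ys) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_s, ys))
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    return -sxy / sxx


# Relative protein concentration after mixing 560 µl with 140 µl.
dilution = mixed_concentration(1.0, 560.0, 140.0)

# Synthetic readings every minute for 25 minutes (illustrative values only).
K_TRUE = 4.9e-3
times = [60.0 * i for i in range(26)]
readings = [1.0 - 0.6 * math.exp(-K_TRUE * t) for t in times]
k_fit = first_order_rate(times, readings, plateau=1.0)
print(f"dilution factor {dilution:.2f}, fitted rate {k_fit:.2e} s^-1")
```

With real data the plateau is not known in advance and noise near the plateau dominates the logarithm, so a nonlinear fit of all three parameters would usually be preferred.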

The binding constants (Ka) for the 4-4-20 single-chain antigen-binding 
protein monomers diluted in 20 mM HEPES buffer pH 8.0 in the absence of 
fluorescein were also determined (see Table 4). 

The three polypeptide linkers in these experiments differ in length. 
The 202', 212 and 216 linkers are 12, 14 and 18 residues long, respectively. 
These experiments show that there are two effects of linker length on the 
4-4-20 antigen-binding proteins: first, the shorter the linker length the higher 
the fraction of bivalent protein formed; second, the fluorescein dissociation 
rates of the monomeric single-chain antigen-binding proteins are affected more 
by the linker length than are the dissociation rates of the bivalent 
antigen-binding proteins. With the shorter linkers 202' and 212, the bivalent 
antigen-binding proteins have slower dissociation rates than the monomers. 
Thus, the linkers providing optimum production and binding affinities for 
monomeric and bivalent antigen-binding proteins may be different. Longer 
linkers may be more suitable for monomeric single-chain antigen-binding 
proteins, and shorter linkers may be more suitable for multivalent 
antigen-binding proteins. 



Table 3 
Linker Designs 

VL         Linker                     VH         Linker Name   Reference 

-KLEIE     GKSSGSGSESKS (1)           TQKLD-     202'          Bird et al. 
-KLEIK     GSTSGSGKSSEGKG (2)         EVKLD-     212           Bedzyk et al. 
-KLEIK     GSTSGSGKSSEGSGSTKG (3)     EVKLD-     216           This application 
-KLVLK     GSTSGKPSEGKG (4)           EVKLD-     217           This application 

(1) SEQ ID NO. 1 
(2) SEQ ID NO. 2 
(3) SEQ ID NO. 3 
(4) SEQ ID NO. 4 
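
The linker lengths quoted in the text (12, 14 and 18 residues for 202', 212 and 216) can be confirmed by counting the residues of the sequences in Table 3. The sketch below does only that; the dictionary name is hypothetical, and the length reported for linker 217 is computed here rather than stated in the text.

```python
# One-letter amino acid sequences of the linkers listed in Table 3.
LINKERS = {
    "202'": "GKSSGSGSESKS",
    "212": "GSTSGSGKSSEGKG",
    "216": "GSTSGSGKSSEGSGSTKG",
    "217": "GSTSGKPSEGKG",
}

lengths = {name: len(sequence) for name, sequence in LINKERS.items()}
for name, length in lengths.items():
    print(f"linker {name}: {length} residues")
```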





Table 4 

Effects of Linkers on the SCA Protein Monomers and Dimers 

                                                Linker 
                             202'               212                216 

Monomer 
  Fraction                   0.47               0.66               0.90 
  Ka                         0.5 x 10^? M^-1    1.0 x 10^? M^-1    1.3 x 10^? M^-1 
  Dissociation rate          8.2 x 10^-3 s^-1   4.9 x 10^-3 s^-1   3.3 x 10^-3 s^-1 

Dimer 
  Fraction                   0.53               0.34               0.10 
  Dissociation rate          4.6 x 10^-3 s^-1   3.5 x 10^-3 s^-1   3.5 x 10^-3 s^-1 

Monomer/Dimer 
  Dissociation rate ratio    1.8                1.4                0.9 
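
The last row of Table 4 can be reproduced from the dissociation rates above it. The sketch below works with the mantissas only, on the assumption that the monomer and dimer rates in each column carry the same power of ten so that it cancels in the ratio; the variable names are hypothetical.

```python
# Dissociation rate mantissas from Table 4 (common power of ten assumed).
MONOMER_RATE = {"202'": 8.2, "212": 4.9, "216": 3.3}
DIMER_RATE = {"202'": 4.6, "212": 3.5, "216": 3.5}

# Fractions of monomer and dimer recovered for each linker.
MONOMER_FRACTION = {"202'": 0.47, "212": 0.66, "216": 0.90}
DIMER_FRACTION = {"202'": 0.53, "212": 0.34, "216": 0.10}

ratios = {
    linker: round(MONOMER_RATE[linker] / DIMER_RATE[linker], 1)
    for linker in MONOMER_RATE
}
fraction_totals = {
    linker: MONOMER_FRACTION[linker] + DIMER_FRACTION[linker]
    for linker in MONOMER_FRACTION
}
for linker in ratios:
    print(f"linker {linker}: monomer/dimer rate ratio {ratios[linker]}")
```

A ratio above 1 means the dimer releases fluorescein more slowly than the monomer, which is the case for the two shorter linkers.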



Example 4 
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bivalent and monomeric forms of the 4-4-20 antigen-binding proteins were 
separated by cation exchange HPLC (polyaspartate column) using a 50 min. 
linear gradient between the load buffer (0.06 M MOPS, 0.001 M calcium 
acetate pH 6.4) and a second buffer (0.06 M MOPS, 0.02 M calcium acetate 

5 pH 7.5). Two 0.02 ml samples were separated, and fractions of the bivalent 

and monomeric protein peaks were collected on each run. The amount of 
protein contained in each fraction was determined fi-om the absorbance at 278 
ran from the first separation. Before collecting the fractions from the second 
separation run. each fraction tube had a sufficient quantity of 1.03 x 10' M 

10 fluorescein added to it, such that after the fractions were collected a 1-to-l 

molar ratio of protein-to-fluorescein existed. Addition of fluorescein stabilized 
the bivalent form of the 4-4-20 antigen-binding proteins. These samples were 

kept at 2°C (on ice). 

The fluorescein dissociation rates were determined for each of these 

15 samples following the procedures described by Herron. J.N., in Fluorescence 

Hapten: An Immunological Probe, E.W. Voss, Ed., CRC Press. Boca Raton. 
FL (1984). A sample was first diluted with 20 mM HEPES buffer pH 8.0 to 
5.0 X 10^ M 4-4-20 antigen-binding protein. 560 ^1 of the 5.0 x lO"* M 4-4- 
20 antigen-binding protein sample was added to a cuvette in a fluorescence 

20 spectrophotometer equilibrated at 2»C and the fluorescence was read. 140 ^1 

of 1.02 X 10^ M fluoresceinamine was added to the cuvette, and the 
fluorescence was read every 1 minute for up to 25 minutes (see Table 4). 

The binding constants (KJ for the 4-4-20 single-chain antigen-binding 
protein monomers diluted in 20 mM HEPES buffer pH 8.0 in the absence of 

25 fluorescein were also determined (see Table 4). 

The three polypeptide linkers in these experiments differ in length. 
The 202', 212 and 216 linkers are 12. 14 and 18 residues long, respectively. 
These experiments show that there are two effects of linker length on the 4-4- 
20 antigen-binding proteins: first, the shorter the linker length the higher the 

30 fraction of bivalem protein formed; second, the fluorescein dissociation rates 

of the monomeric single-chain antigen-binding proteins are effected more by 
the linker length than are the dissociation rates of the bivalent antigen-binding 
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proteins. With the shorter linkers 202' and 212, the bivalent antigen-binding 
proteins have slower dissociation rates than the monomers. Thus, the linkers 
providing optimum production and binding affinities for monomeric and 
bivalent antigen-binding proteins may be different. Longer linkers may be 
more suitable for monomeric single-chain antigen-binding proteins, and shorter 
linkers may be more suitable for multivalent antigen-binding proteins. 



("1 



10 



15 



Table 3 

1 Linker Designs 


1 


Linker 


v„ 


Linker 
Name 


Reference 


-KLEIE 


GKSS6SGSESKS' 


TQKLD- 


202' 


Bird et al. 


-KLEIK 


GSTSGSGKSSEGKG' 


EVKLD- 


212 


Bedzyk et al. 


-KLEIK 


GSTSGSGKSSEGSGSTKG' 


EVKLD- 


216 


This application 


1 -KLVLK 


GSTSGKPSEGKG^ 


EVKLD- 


217 


This application 



(2) SEQ ID NO. 2 

(3) SEQ ID NO. 3 
; (4) SEQ ID NO. 4 







Table 4 

Effects of Linkers on the SCA Protein Monomers and Dimers 

                                          Linker 
                            202'                212                 216 
Monomer 
  Fraction                  0.47                0.66                0.90 
  Ka                        0.5 x 10^[?] M^-1   1.0 x 10^[?] M^-1   1.3 x 10^[?] M^-1 
  Dissociation rate         8.2 x 10^-3 s^-1    4.9 x 10^[?] s^-1   3.3 x 10^[?] s^-1 
Dimer 
  Fraction                  0.53                0.34                0.10 
  Dissociation rate         4.6 x 10^[?] s^-1   3.5 x 10^[?] s^-1   3.5 x 10^[?] s^-1 
Monomer/Dimer 
  Dissociation rate ratio   1.8                 1.4                 0.9 

[?] = exponent illegible in the source. 
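
The monomer/dimer ratios in the last row of Table 4 follow from the dissociation rates above them. The sketch below reproduces them from the mantissas alone; it assumes (this is an assumption, since several exponents are not legible) that the monomer and dimer rates for a given linker share the same power of ten.

```python
# Mantissas of the fluorescein dissociation rates from Table 4.
# Assumption: for each linker the monomer and dimer rates carry the
# same power of ten, so the ratio depends only on the mantissas.
MONOMER_RATE = {"202'": 8.2, "212": 4.9, "216": 3.3}
DIMER_RATE = {"202'": 4.6, "212": 3.5, "216": 3.5}


def rate_ratios(monomer, dimer):
    """Monomer/dimer dissociation rate ratio, rounded to one decimal."""
    return {name: round(monomer[name] / dimer[name], 1) for name in monomer}


RATIOS = rate_ratios(MONOMER_RATE, DIMER_RATE)
for name, ratio in RATIOS.items():
    print(f"linker {name}: monomer/dimer ratio = {ratio}")
```

A ratio above 1 means the dimer releases fluorescein more slowly than the monomer, which is the case for the two shorter linkers.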



Example 4 
Genetic Construction of a Mixed-Fragment Bivalent Antigen- 
Binding Protein 

The genetic constructions for one particular heterobivalent antigen- 
binding protein according to the Rearrangement model are shown in Figures 
10A and 10B. Figure 10A is an amino acid and nucleotide sequence listing 
of the 4-4-20 VL/212/CC49 VH construct, coding for a single-chain protein 
with a 4-4-20 VL linked via a 212 polypeptide linker to a CC49 VH. Figure 
10B is a similar listing showing the CC49 VL/212/4-4-20 VH construct, coding 
for a single-chain protein with a CC49 VL linked via a 212 linker to a 4-4-20 
VH. These single-chain proteins may recombine according to the 
Rearrangement model to generate a heterobivalent protein comprising a CC49 
antigen-binding site linked to a 4-4-20 antigen-binding site, as shown in Figure 
5B. 

"4-4-20 VL" means the variable region of the light chain of the 4-4-20 
mouse monoclonal antibody (Bird, R.E. et al., Science 242:423 (1988)). The 
number "212" refers to a specific 14-residue polypeptide linker that links the 
4-4-20 VL and the CC49 VH. See Bedzyk et al., J. Biol. Chem. 
265:18615-18620 (1990). "CC49 VH" is the variable region of the heavy 
chain of the CC49 antibody, which binds to the TAG-72 antigen. The CC49 
antibody was developed at The National Institutes of Health by Schlom et al. 
See Generation and Characterization of B72.3 Second Generation Monoclonal 
Antibodies Reactive With The Tumor-associated Glycoprotein 72 Antigen, 
Cancer Research 48:4588-4596 (1988). 

Insertion of the sequences shown in FIGS. 10A and 10B, by standard 
recombinant DNA methodology, into a suitable plasmid vector will enable one 
of ordinary skill in the art to transform a suitable host for subsequent 
expression of the single-chain proteins. See Maniatis et al., Molecular 
Cloning, A Laboratory Manual, p. 104, Cold Spring Harbor Laboratory 
(1982), for general recombinant techniques for accomplishing the aforesaid 
goals; see also U.S. Patent 4,946,778 (Ladner et al.) for a complete 
description of methods of producing single-chain protein molecules by 
recombinant DNA technology. 

To produce multivalent antigen-binding proteins from the two single- 
chain proteins, 4-4-20 VL/212/CC49 VH and CC49 VL/212/4-4-20 VH, the two 
single-chain proteins are dialyzed into 0.5 M GuHCl/20% EtOH, being 
combined in a single solution either before or after dialysis. The multivalent 
proteins are then produced and separated as described in Example 2. 



Example 5 

Preparation of Multivalent 
Antigen-Binding Proteins by Chemical Cross-Linking 

Free cysteines were engineered into the C-terminus of the 4-4-20/212 
single-chain antigen-binding protein, in order to chemically crosslink the 
protein. The design was based on the hinge region found in antibodies 
between the CH1 and CH2 regions. In order to try to reduce antigenicity in 
humans, the hinge sequence of the most common IgG class, IgG1, was 
chosen. The 4-4-20 Fab structure was examined and it was determined that 
the C-terminal sequence GluH216-ProH217-ArgH218 was part of the CH1 
region and that the hinge between CH1 and CH2 starts with ArgH218 or 
GlyH219 in the mouse 4-4-20 IgG2A antibody. Figure 14 shows the structure 
of a human IgG. The hinge region is indicated generally. Thus the hinge 
from human IgG1 would start with LysH218 or SerH219 (see Table 5). 

The C-terminal residue in most of the single-chain antigen-binding 
proteins described to date is the amino acid serine. In the design for the hinge 
region, the C-terminal serine in the 4-4-20/212 single-chain antigen-binding 
protein was made the first serine of the hinge and the second residue of the 
hinge was changed from a cysteine to a serine. This hinge cysteine normally 
forms a disulfide bridge to the C-terminal cysteine in the light chain. 



TABLE 5 

                                  218 
IgG2A mouse(1)                E P R G P T I K P C P P C L C - 
IgG1 human(2)               A E P K S C D K T H T C P P C - 
SCA*(3)                   - - V T V S 
SCA* Hinge design 1(4)    - - V T V S S D K T H T C 
SCA* Hinge design 2(5)    - - V T V S S D K T H T C P P C 

* - single-chain antigen-binding protein 

(1) SEQ ID NO. 5 
(2) SEQ ID NO. 6 
(3) SEQ ID NO. 7 
(4) SEQ ID NO. 8 
(5) SEQ ID NO. 9 

There are possible advantages to having two C-terminal cysteines, for 
they might form an intramolecular disulfide bond, making the protein recovery 
easier by protecting the sulfurs from oxidation. The hinge regions were added 
by introduction of a BstE II restriction site in the 3'-terminus of the gene 
encoding the 4-4-20/212 single-chain antigen-binding protein (see Figures 15A- 
15B). 

The monomeric single-chain antigen-binding protein containing the C- 
terminal cysteine can be purified using the normal methods of purifying 
single-chain antigen-binding proteins, with minor modifications to protect the 
free sulfhydryls. The cross-linking could be accomplished in one of two 
ways. First, the purified single-chain antigen-binding protein could be treated 
with a mild reducing agent, such as dithiothreitol, then allowed to air oxidize 
to form a disulfide-bond between the individual single-chain antigen-binding 
proteins. This type of chemistry has been successful in producing 
heterodimers from whole antibodies (Nisonoff et al., Quantitative Estimation 
of the Hybridization of Rabbit Antibodies, Nature 4826:355-359 (1962); 
Brennan et al., Preparation of Bispecific Antibodies by Chemical 
Recombination of Monoclonal Immunoglobulin G1 Fragments, Science 229:81- 
83 (1985)). Second, chemical crosslinking agents such as bis-maleimidehexane 
could be used to cross-link two single-chain antigen-binding proteins by their 
C-terminal cysteines. See Partis et al., J. Prot. Chem. 2:263-277 (1983). 

Example 6 



Genetic Construction of Bivalent Antigen-Binding Proteins 

Bivalent antigen-binding proteins can be constructed genetically and 
subsequently expressed in E. coli or other known expression systems. This 
can be accomplished by genetically removing the stop codons at the end of a 
gene encoding a monomeric single-chain antigen-binding protein and inserting 
a linker and a gene encoding a second single-chain antigen-binding protein. 
We have constructed a gene for a bivalent CC49/212 antigen-binding protein 
in this manner (see Figure 16). The CC49/212 gene in the starting expression 
plasmid is in an Aat II to Bam HI restriction fragment (see Bird et al., Single- 
Chain Antigen-Binding Proteins, Science 242:423-426 (1988); and Whitlow 
et al., Single-Chain Fv Proteins and Their Fusion Proteins, Methods 2:97-105 
(1991)). The two stop codons and the Bam HI site at the C-terminal end of 
the CC49/212 antigen-binding protein gene were replaced by a single residue 
linker (Ser) and an Aat II restriction site. The resulting plasmid was cut with 
Aat II and the purified Aat II to Aat II restriction fragment was ligated into 
Aat II cut CC49/212 single-chain antigen-binding protein expression plasmid. 
The resulting bivalent CC49/212 single-chain antigen-binding protein 
expression plasmid was transfected into an E. coli expression host that 
contained the gene for the cI857 temperature-sensitive repressor. Expression 
of single-chain antigen-binding protein in this system is induced by raising the 
temperature from 30°C to 42°C. Fig. 17 shows the expression of the divalent 
CC49/212 single-chain antigen-binding protein of Fig. 16 at 42°C, on an SDS- 
PAGE gel containing total E. coli protein. Lane 1 contains the molecular 
weight standards. Lane 2 is the uninduced E. coli production strain grown at 
30°C. Lane 3 is divalent CC49/212 single-chain antigen-binding protein 
induced by growth at 42°C. The arrow shows the band of expressed divalent 
CC49/212 single-chain antigen-binding protein. 
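
The cloning logic of this example (replace the stop codons and Bam HI site with a Ser codon plus an Aat II site, then join the Aat II to Aat II fragment to an intact copy of the gene) can be illustrated as plain string manipulation. This is only a sketch under stated assumptions: the toy gene is a made-up 36-base stand-in rather than the CC49/212 sequence, the Ser codon TCC is an arbitrary choice, and ligation is modelled as simple concatenation at the shared site, ignoring fragment orientation and overhang chemistry.

```python
AAT_II = "GACGTC"  # Aat II recognition sequence; read in frame it encodes Asp-Val
BAM_HI = "GGATCC"  # Bam HI recognition sequence
STOP_CODONS = {"TAA", "TAG", "TGA"}
SER_CODON = "TCC"  # hypothetical choice; the text only specifies a single Ser


def open_three_prime_end(gene):
    """Replace the trailing stop codons and Bam HI site with Ser + Aat II."""
    if not gene.endswith(BAM_HI):
        raise ValueError("expected a Bam HI site at the 3' end")
    body = gene[: -len(BAM_HI)]
    while body[-3:] in STOP_CODONS:
        body = body[:-3]
    return body + SER_CODON + AAT_II


def make_tandem(gene):
    """Join the Aat II-Aat II fragment to an intact gene copy at the shared site."""
    if not gene.startswith(AAT_II):
        raise ValueError("expected an Aat II site at the 5' end")
    fragment = open_three_prime_end(gene)
    return fragment[: -len(AAT_II)] + gene


def first_stop(seq):
    """Index of the first in-frame stop codon, or None if there is none."""
    for index in range(0, len(seq) - 2, 3):
        if seq[index:index + 3] in STOP_CODONS:
            return index // 3
    return None


# Toy monomer gene: Aat II site, six sense codons, two stops, Bam HI site.
TOY_GENE = "GACGTC" + "GTTATG" + "GTCACCGTCTCC" + "TAATAA" + "GGATCC"
TANDEM = make_tandem(TOY_GENE)
print(first_stop(TOY_GENE), first_stop(TANDEM), len(TANDEM))
```

In the tandem product the reading frame runs through the first copy and the Ser linker into the second copy, and the first stop codon appears only at the end of the second copy.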
Example 7 



Construction, Purification, and Testing of 4-4-20/CC49 
Heterodimer Fv With 217 Linkers 

The goals of this experiment were to produce, purify and analyze for 
activity a new heterodimer Fv that would bind to both fluorescein and the pan- 
carcinoma antigen TAG-72. The design consisted of two polypeptide chains, 
which associated to form the active heterodimer Fv. Each polypeptide chain 
can be described as a mixed single-chain Fv (mixed sFv). The first mixed sFv 
(GX 8952) comprised a 4-4-20 variable light chain (VL) and a CC-49 variable 
heavy chain (VH) connected by a 217 polypeptide linker (Figure 19A). The 
second mixed sFv (GX 8953) comprised a CC-49 VL and a 4-4-20 VH 
connected by a 217 polypeptide linker (Figure 19B). The sequence of the 217 
polypeptide linker is shown in Table 3. Construction of analogous CC49/4-4- 
20 heterodimers connected by a 212 polypeptide linker is described in 
Example 4. 

Results 



A. Purification 

One 10-liter fermentation of each mixed sFv was grown on casein 
digest-glucose-salts medium at 32°C to an optical density at 600 nm of 15 to 
20. The mixed sFv expression was induced by raising the temperature of the 
fermentation to 42°C for one hour. 277 gm (wet cell weight) of E. coli strain 
GX 8952 and 233 gm (wet cell weight) of E. coli strain GX 8953 were 
harvested in a centrifuge at 7000g for 10 minutes. The cell pellets were kept 
and the supernatant discarded. The cell pellets were frozen at -20°C for 
storage. 

2.55 liters of "lysis/wash buffer" (50 mM Tris/200 mM NaCl/1 mM 
EDTA, pH 8.0) was added to both of the mixed sFv's cell pellets, which were 
previously thawed and combined to give 510 gm of total wet cell weight. After 
complete suspension of the cells they were then passed through a Gaulin 
homogenizer at 9000 psi and 4°C. After this first pass the temperature 
increased to 23°C. The temperature was immediately brought down to 0°C 
using dry ice and methanol. The cell suspension was passed through the 
Gaulin homogenizer a second time and centrifuged at 8000 rpm with a Dupont 
GS-3 rotor for 60 minutes. The supernatant was discarded after centrifugation 
and the pellets resuspended in 2.5 liters of "lysis/wash buffer" at 4°C. This 
suspension was centrifuged for 45 minutes at 8000 rpm with the Dupont GS-3 
rotor. The supernatant was again discarded and the pellet weighed. The 
pellet weight was 136.1 gm. 

1300 ml of 6 M guanidine hydrochloride/50 mM Tris/50 mM KCl/10 mM 
CaCl2, pH 8.0 at 4°C was added to the washed pellet. An overhead mixer was 
used to speed solubilization. After one hour of mixing, the heterodimer 
GuHCl extract was centrifuged for 45 minutes at 8000 rpm and the pellet was 
discarded. The 1425 ml of heterodimer Fv 6 M GuHCl extract was slowly 
added (16 ml/min) to 14.1 liters of "Refold Buffer" (50 mM Tris/50 mM 
KCl/10 mM CaCl2, pH 8.0) under constant mixing at 4°C to give an 
approximate dilution of 1:10. Refolding took place overnight at 4°C. 
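
The refolding step is a simple dilution, and its consequences can be worked out from the volumes given above. The following sketch uses only those figures; the function and key names are illustrative, and the calculation assumes ideal mixing with additive volumes.

```python
def refold_dilution(extract_ml, buffer_ml, denaturant_molar, feed_ml_per_min):
    """Summarize a dilution-refolding step (assumes additive volumes)."""
    total_ml = extract_ml + buffer_ml
    return {
        "total_liters": total_ml / 1000.0,
        "fold_dilution": total_ml / extract_ml,
        "final_denaturant_molar": denaturant_molar * extract_ml / total_ml,
        "addition_minutes": extract_ml / feed_ml_per_min,
    }


# 1425 ml of 6 M GuHCl extract added at 16 ml/min to 14.1 liters of buffer.
REFOLD = refold_dilution(1425.0, 14100.0, 6.0, 16.0)
for key, value in REFOLD.items():
    print(f"{key}: {value:.2f}")
```

Under these assumptions the extract is diluted about 11-fold into roughly 15.5 liters, the residual guanidine hydrochloride is about 0.55 M, and the addition takes about 89 minutes.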

After 17 hours of refolding the anti-fluorescein activity was checked by 
a 40% quenching assay, and the amount of active protein calculated. 150mg 
total active heterodimer Fv was found by the 40% quench assay, assuming a 
54,000 molecular weight. 

4 liters of prechilled (4°C) 190 proof ethanol was added to the 15 liters 
of refolded heterodimer with mixing for 3 hours. The mixture sat overnight 
at 4°C. A flocculent precipitate had settled to the bottom after this overnight 
treatment. The nearly clear solution was filtered through a Millipak-200 
(0.22 µ) filter so as to not disturb the precipitate. A 40% quench assay 
showed that 10% of the anti-fluorescein activity was recovered in the filtrate. 

The filtered sample of heterodimer was dialyzed, using a Pellicon 
system containing 10,000 dalton MWCO membranes, with "dialysis buffer" 
(40 mM MOPS/0.5 mM calcium acetate (CaAc), pH 6.4) at 4°C. 20 liters of 
dialysis buffer was required before the conductivity of the retentate was equal 
to that of the dialysis buffer (~500 µS). After dialysis the heterodimer sample 
was filtered through a Millipak-20 filter, 0.22 µ. After this step a 40% quench 
assay showed there was 8.8 mg of active protein. 

The crude heterodimer sample was loaded on a PolyCAT A cation 
exchange column at 20 ml/min. The column was previously equilibrated with 
60 mM MOPS, 1 mM CaAc, pH 6.4, at 4°C (Buffer A). After loading, the 
column was washed with 150 ml of "Buffer A" at 15 ml/min. A 50 min linear 
gradient was performed at 15 ml/min using "Buffer A" and "Buffer B" (60 mM 
MOPS, 20 mM CaAc, pH 7.5 at 4°C). The gradient conditions are presented 
in Table 6. "Buffer C" comprises 60 mM MOPS, 100 mM CaCl2, pH 7.5. 



Table 6 

Time      %A       %B       %C       Flow 
0:00      100.0    0.0      0.0      15 ml/min 
50:00     0.0      100.0    0.0      15 ml/min 
52:00     0.0      100.0    0.0      15 ml/min 
54:00     0.0      0.0      100.0    15 ml/min 
58:00     0.0      0.0      100.0    15 ml/min 
60:00     100.0    0.0      0.0      15 ml/min 
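
The gradient program of Table 6 can be read as a set of breakpoints between which the buffer proportions change linearly. The sketch below interpolates the composition at any time and, from it, the calcium concentration delivered to the column. Linear interpolation between the tabulated points is an assumption about the pump program, the function names are illustrative, and the calcium values are taken from the buffer recipes in the text (1 mM and 20 mM calcium acetate for Buffers A and B, 100 mM calcium chloride for Buffer C).

```python
# (minutes, %A, %B, %C) breakpoints taken from Table 6.
GRADIENT = [
    (0.0, 100.0, 0.0, 0.0),
    (50.0, 0.0, 100.0, 0.0),
    (52.0, 0.0, 100.0, 0.0),
    (54.0, 0.0, 0.0, 100.0),
    (58.0, 0.0, 0.0, 100.0),
    (60.0, 100.0, 0.0, 0.0),
]

# Calcium concentration (mM) of Buffers A, B and C as described in the text.
CALCIUM_MM = (1.0, 20.0, 100.0)


def composition(minutes):
    """Return (%A, %B, %C) at a given time by linear interpolation."""
    if not GRADIENT[0][0] <= minutes <= GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, *start), (t1, *end) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= minutes <= t1:
            fraction = (minutes - t0) / (t1 - t0)
            return tuple(a + fraction * (b - a) for a, b in zip(start, end))


def calcium_mm(minutes, buffer_calcium=CALCIUM_MM):
    """Calcium concentration (mM) of the mixed eluent at a given time."""
    return sum(pct / 100.0 * conc
               for pct, conc in zip(composition(minutes), buffer_calcium))


print(composition(25.0), calcium_mm(25.0))
```

Passing a different `buffer_calcium` tuple covers a run in which Buffer B is made up at another calcium acetate concentration.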



Approximately 50 ml fractions were collected and analyzed for activity, 
purity, and molecular weight by size-exclusion chromatography. The fractions 
were not collected by peaks, so contamination between peaks is likely. 
Fractions 3 through 7 were pooled (total volume 218 ml), concentrated to 
50 ml and dialyzed against 4 liters of 60 mM MOPS, 0.5 mM CaAc pH 6.4 at 
4°C overnight. The dialyzed pool was filtered through a 0.22 µ filter and 
checked for absorbance at 280 nm. The filtrate was loaded onto the PolyCAT 
A column, equilibrated with 60 mM MOPS, 1 mM CaAc pH 6.4 at 4°C, at a 
flow rate of 10 ml/min. Buffer B was changed to 60 mM MOPS, 10 mM CaAc 
pH 7.5 at 4°C. The gradient was run as in Table 6. The fractions were 
collected by peak and analyzed for activity, purity, and molecular weight. 
The chromatogram is shown in Figure 20. Fraction identification and analysis 
is presented in Table 7. 



Table 7 

Fraction Analysis of the Heterodimer Fv protein 

Fraction No.   A280 reading   Total Volume (ml)   HPLC-SE Elution Time (min) 
2              0.161          36                  20.525 
3              0.067          40 
4              0.033          40 
5              0.178          45                  19.133 
6              0.234          50                  19.163 
7              0.069          50 
8              0.055          40 





Fractions 2 to 7 and the starting material were analyzed by SDS gel 
electrophoresis, 4-20%. A picture and description of the gel is presented in 
Figure 21. 



B. HPLC Size Exclusion Results 

Fractions 2, 5, and 6 correspond to the three main peaks in Figure 20 
and therefore were chosen to be analyzed by HPLC size exclusion. Fraction 
2 corresponds to the peak that runs at 21.775 minutes in the preparative 
purification (Figure 20), and runs on the HPLC sizing column at 20.525 
minutes, which is in the monomeric position (Figure 22A). Fractions 5 and 
6 (30.1 and 33.455 minutes, respectively, in Figure 20) run on the HPLC 
sizing column (Figures 22B and 22C) at 19.133 and 19.163 minutes, 
respectively (see Table 7). Therefore, both of these peaks could be considered 
dimers. 40% quenching assays were performed on all fractions of this 
purification. Only fraction 5 gave significant activity. 2.4 mg of active 
CC49/4-4-20 heterodimer Fv was recovered in fraction 5, based on the Scatchard 
analysis described below. 
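
The amounts of active protein reported at each stage of the purification can be collected into a simple yield table. The sketch below uses only figures stated in this example (150 mg after refolding, 10% of the activity in the ethanol filtrate, 8.8 mg after dialysis and filtration, 2.4 mg in fraction 5). Note that the early values come from the 40% quench assay and the last from Scatchard analysis, so the percentages are only approximate; the step labels and function name are illustrative.

```python
# (step, mg of active heterodimer Fv) as reported in the text.
STEPS = [
    ("refolded extract", 150.0),
    ("ethanol filtrate (10% of activity)", 0.10 * 150.0),
    ("dialyzed and filtered", 8.8),
    ("PolyCAT A fraction 5", 2.4),
]


def yield_table(steps):
    """Return (step, mg, % of previous step, % of starting material)."""
    start = steps[0][1]
    rows = []
    previous = start
    for name, mg in steps:
        rows.append((name, mg, 100.0 * mg / previous, 100.0 * mg / start))
        previous = mg
    return rows


YIELDS = yield_table(STEPS)
for name, mg, step_pct, overall_pct in YIELDS:
    print(f"{name:36s} {mg:6.1f} mg  step {step_pct:5.1f}%  overall {overall_pct:5.1f}%")
```

On these figures the largest single loss is the ethanol precipitation step, and the overall recovery of active heterodimer is under 2% of the refolded material.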

C. N-terminal sequencing of the fractions 

The active heterodimer Fv fraction should contain both polypeptide 
chains. N-terminal sequence analysis showed that fractions 5 and 6 displayed 
N-terminal sequences consistent with the presence of both CC49 and 4-4-20 
polypeptides and fraction 2 displayed a single sequence corresponding to the 
CC49/212/4-4-20 polypeptide only. We believe that fraction 6 was 
contaminated by fraction 5 (see Figure 20), since only fraction 5 had 
significant activity. 



D. Anti-fluorescein activity by Scatchard analysis 

The fluorescein association constants (Ka) were determined for 
fractions 5 and 6 using the fluorescence quenching assay described by Herron, 
J.N., in Fluorescence Hapten: An Immunological Probe, E.W. Voss, ed., 
CRC Press, Boca Raton, FL (1984). Each sample was diluted to 
approximately 5.0 x 10^[?] M with 20 mM HEPES buffer pH 8.0. 590 µl of the 
5.0 x 10^[?] M sample was added to a cuvette in a fluorescence 
spectrophotometer equilibrated at room temperature. In a second cuvette 590 
µl of 20 mM HEPES buffer pH 8.0 was added. To each cuvette was added 
10 µl of 3.0 x 10^[?] M fluorescein in 20 mM HEPES buffer pH 8.0, and the 
fluorescence recorded. This was repeated until 140 µl of fluorescein had been 
added. The resulting Scatchard analysis for fraction 5 shows a binding 
constant of 1.16 x 10^[?] M^-1 (see Figure 23). This is very close 
to the 4-4-20/212 sFv constant of 1.1 x 10^[?] M^-1 (see Pantoliano et al., 
Biochemistry 30:10117-10125 (1991)). The R intercept on the Scatchard 
analysis represents the fraction of active material. For fraction 5, 61% of the 
material was active. The graph of the Scatchard analysis on fraction 6 shows 
a binding constant of 3.3 x 10^[?] M^-1 and 14% active. The activity that is 
present in fraction 6 is most likely due to contaminants from fraction 5. 
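
For a single class of binding sites, the Scatchard relation is r/c = Ka(n - r), where r is the bound fluorescein per protein molecule, c is the free fluorescein concentration and n is the fraction of active sites. A straight-line fit of r/c against r therefore gives Ka from the slope and the active fraction from the intercept on the r axis. The sketch below demonstrates the fit on synthetic data only: the association constant of 1.0e9 per molar is a hypothetical value chosen for illustration, not a figure from this disclosure, and the function names are likewise illustrative.

```python
def scatchard_fit(bound_ratio, free_conc):
    """Least-squares fit of r/c versus r; returns (Ka, r-axis intercept)."""
    y_values = [r / c for r, c in zip(bound_ratio, free_conc)]
    count = len(bound_ratio)
    mean_x = sum(bound_ratio) / count
    mean_y = sum(y_values) / count
    s_xx = sum((x - mean_x) ** 2 for x in bound_ratio)
    s_xy = sum((x - mean_x) * (y - mean_y)
               for x, y in zip(bound_ratio, y_values))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return -slope, -intercept / slope


def simulate_binding(ka, active_fraction, free_conc):
    """Synthetic single-site binding data: r = n*Ka*c / (1 + Ka*c)."""
    return [active_fraction * ka * c / (1.0 + ka * c) for c in free_conc]


# Synthetic titration (hypothetical Ka; 61% active as an example value).
FREE = [0.2e-9, 0.5e-9, 1.0e-9, 2.0e-9, 5.0e-9]
BOUND = simulate_binding(1.0e9, 0.61, FREE)
KA_FIT, ACTIVE_FIT = scatchard_fit(BOUND, FREE)
print(f"Ka = {KA_FIT:.3e} per molar, active fraction = {ACTIVE_FIT:.2f}")
```

With real titration data the points scatter about the line, and the quality of the fit indicates whether a single-site model is adequate.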

E. Anti-TAG-72 activity by competition ELISA 

The CC49 monoclonal antibody was developed by Dr. Jeffrey Schlom's 
group, Laboratory of Tumor Immunology and Biology, National Cancer 
Institute. It binds specifically to the pan-carcinoma tumor antigen TAG-72. 
See Muraro, R., et al., Cancer Research 48:4588-4596 (1988). 

To determine the binding properties of the bivalent CC49/4-4-20 Fv 
(fraction 5) and the CC49/212 sFv, a competition enzyme-linked 
immunosorbent assay (ELISA) was set up in which a CC49 IgG labeled with 
biotin was competed against unlabeled CC49/4-4-20 Fv and the CC49/212 sFv 
for binding to TAG-72 on a human breast carcinoma extract (see Figure 24). 
The amount of biotin-labeled CC49 IgG was determined using a preformed 
complex with avidin and biotin coupled to horseradish peroxidase and o- 
phenylenediamine dihydrochloride (OPD). The reaction was stopped with 4N 
H2SO4 (sulfuric acid) after 10 min, and the optical density read at 490 nm. 
This competition ELISA showed that the bivalent CC49/4-4-20 Fv binds to the 
TAG-72 antigen. The CC49/4-4-20 Fv needed a two hundred-fold higher 
protein concentration to displace the IgG than the single-chain Fv. 



We have chemically crosslinked dimers of 4-4-20/212 antigen-binding 
protein with the two cysteine C-terminal extension (4-4-20/212 CPPC single- 
chain antigen-binding protein) in two ways. In Example 5 we describe the 
design and genetic construction of the 4-4-20/212 CPPC single-chain antigen- 
binding protein (hinge design 2 in Table 5). Figure 15B shows the nucleic 
acid and protein sequences of this protein. After purifying the 4-4-20/212 
CPPC single-chain antigen-binding protein, using the methods described in 
Whitlow and Filpula, Meth. Enzymol. 2:97 (1991), dimers were formed by 
two methods. First, the free cysteines were mildly reduced with dithiothreitol 
(DTT) and then the disulfide-bonds between the two molecules were allowed 
to form by air oxidation. Second, the chemical crosslinker bis- 
maleimidehexane was used to produce dimers by crosslinking the free 
cysteines from two 4-4-20/212 CPPC single-chain antigen-binding proteins. 

A 0.1 mg/ml solution of the 4-4-20/212 CPPC single-chain antigen- 
binding protein was mildly reduced using 1 mM DTT, 50 mM HEPES, 50 mM 
NaCl, 1 mM EDTA buffer pH 8.0 at 4°C. The samples were dialyzed against 
50 mM HEPES, 50 mM NaCl, 1 mM EDTA buffer pH 8.0 at 4°C overnight, 
to allow the oxidation of free sulfhydryls to intermolecular disulfide-bonds. 
Figure 25 shows a non-reducing SDS-PAGE gel after the air oxidation; it 
shows that approximately 10% of the 4-4-20/212 CPPC protein formed dimers 
with molecular weights around 55,000 Daltons. 

A 0.1 mg/ml solution of the 4-4-20/212 CPPC single-chain antigen- 
binding protein was treated with 2 mM bis-maleimidehexane. Unlike the 
disulfide-bond formed between two free cysteines in the previous example, the 
bis-maleimidehexane crosslink should be stable to reducing agents such 
as β-mercaptoethanol. Figure 26 shows that approximately 5% of the treated 
material produced dimer with a molecular weight of 55,000 Daltons on a 
reducing SDS-PAGE gel (samples were treated with β-mercaptoethanol prior 
to being loaded on the gel). We further purified the bis-maleimidehexane 
treated 4-4-20/212 CPPC protein on a PolyCAT A cation exchange column after 
the protein had been extensively dialyzed against buffer A. Figure 26 shows 
that we were able to enrich the fraction containing the dimer to 
approximately 15%. 

Conclusions 

We have produced a heterodimer Fv from two complementary mixed 
sFv's which has been shown to have the size of a dimer of the sFv's. The N- 
terminal analysis has shown that the active heterodimer Fv contains two 
polypeptide chains. The heterodimer Fv has been shown to be active for both 
fluorescein and TAG-72 binding. 

All publications cited herein are incorporated fully into this disclosure 
by reference. 

From the foregoing it will be appreciated that, although specific 
embodiments of the invention have been described herein for purposes of 
illustration, various modifications may be made without deviating from the 
spirit and scope of the invention and the following claims. As examples, the 
steps of the preferred embodiment constitute only one form of carrying out the 
process in which the invention may be embodied. 



SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Whitlow, Marc 
Wood, James F. 
Hardman, Karl 
Bird, Robert 
Filpula, David 
Rollence, Michele 

(ii) TITLE OF INVENTION: Multivalent Antigen-Binding Proteins 
(iii) NUMBER OF SEQUENCES: 23 

(iv) CORRESPONDENCE ADDRESS: 
(A) ADDRESSEE: Sterne, Kessler, Goldstein & Fox 
(B) STREET: 1225 Connecticut Avenue 
(C) CITY: Washington 
(D) STATE: D.C. 
(E) COUNTRY: U.S.A. 
(F) ZIP: 20036 

(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC compatible 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS 
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER: (to be assigned) 
(B) FILING DATE: Herewith 
(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 
(A) APPLICATION NUMBER: US 07/796,936 
(B) FILING DATE: 25-NOV-1991 

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: Goldstein, Jorge A. 
[remaining attorney/agent fields illegible in the source] 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: (202) 833-7533 
(B) TELEFAX: (202) 833-8716 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: both 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

Gly Lys Ser Ser Gly Ser Gly Ser Glu Ser Lys Ser 
1               5                   10 

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: both 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly 
1               5                   10 

(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: both 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Ser Gly Ser Thr 
1               5                   10                  15 
Lys Gly 

(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: both 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

Gly Ser Thr Ser Gly Lys Pro Ser Glu Gly Lys Gly 
1               5                   10 
(2) INFORMATION FOR SEQ ID NO:5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: both 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 

Glu Pro Arg Gly Pro Thr Ile Lys Pro Cys Pro Pro Cys Leu Cys 
1               5                   10                  15 

(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: both 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

Ala Glu Pro Lys Ser Cys Asp Lys Thr His Thr Cys Pro Pro Cys 
1               5                   10                  15 

(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: both 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

Val Thr Val Ser 
1 

(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: both 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

Val Thr Val Ser Ser Asp Lys Thr His Thr Cys 
1               5                   10 

(2) INFORMATION FOR SEQ ID NO:9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: both 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

Val Thr Val Ser Ser Asp Lys Thr His Thr Cys Pro Pro Cys 
1               5                   10 

(2) INFORMATION FOR SEQ ID NO:10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 731 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: both 
(D) TOPOLOGY: both 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..729 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

[Nucleotide sequence illegible in the source; the protein encoded by the CDS is listed as SEQ ID NO:11.] 

(2) INFORMATION FOR SEQ ID NO:11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 243 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: 

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly 
1               5                   10                  15 
Asp Gln Ala Ser Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser 
            20                  25                  30 
Asn Gly Asn Thr Tyr Leu Arg Trp Tyr Leu Gln Lys Pro Gly Gln Ser 
        35                  40                  45 
Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe Ser Gly Val Pro 
    50                  55                  60 
Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile 
65                  70                  75                  80 
Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser 
                85                  90                  95 
Thr His Val Pro Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys 
            100                 105                 110 
Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Gln Val 
        115                 120                 125 
Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala Ser Val 
    130                 135                 140 
[residues 145-176 illegible in the source] 
Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly 
            180                 185                 190 
Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val Gln 
        195                 200                 205 
Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr Arg 
    210                 215                 220 
Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val Ser 
225                 230                 235                 240 
*   *   Asp 

(2) INFORMATION FOR SEQ ID NO:12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 744 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: both 
(D) TOPOLOGY: both 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..744 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 

[Nucleotides 1-432 and their translation are illegible in the source; the encoded protein is listed as SEQ ID NO:13.] 

ATG AAA CTC TCC TGT GTT GCC TCT GGA TTC ACT TTT AGT GAC TAC TGG     480 
Met Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp 
145                 150                 155                 160 

[nine codons illegible] AAA GGA CTG GAG TGG GTA GCA     528 
Met Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala 
                165                 170                 175 

CAA ATT AGA AAC AAA CCT TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT     576 
Gln Ile Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser 
            180                 185                 190 

GTG AAA GGC AGA TTC ACC ATC TCA AGA GAT GAT TCC AAA AGT AGT GTC     624 
Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val 
        195                 200                 205 

TAC CTG CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG GGT ATC TAT TAC     672 
Tyr Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr 
    210                 215                 220 

TGT ACG GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCA     720 
Cys Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser 
225                 230                 235                 240 

GTC ACC GTC TCC TAA TAA GGA TCC                                     744 
Val Thr Val Ser *   *   Gly Ser 
                245 

(2) INFORMATION FOR SEQ ID NO:13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 248 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 

Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly 
1               5                   10                  15 
Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser 
            20                  25                  30 
Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln 
        35                  40                  45 
Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser Gly Val 
    50                  55                  60 
Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser 
65                  70                  75                  80 
Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln 
                85                  90                  95 
Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu 
            100                 105                 110 
Lys Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Glu 
        115                 120                 125 
Val Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly Arg Pro 
    130                 135                 140 
Met Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp 
145                 150                 155                 160 
Met Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala 
                165                 170                 175 
Gln Ile Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser 
            180                 185                 190 
Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val 
        195                 200                 205 
Tyr Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr 
    210                 215                 220 
Cys Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser 
225                 230                 235                 240 
Val Thr Val Ser *   *   Gly Ser 
                245 

(2) INFORMATION FOR SEQ ID NO:14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 761 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: both 
(D) TOPOLOGY: both 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..756 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 

GAC GTC GTT ATG ACT CAG ACA CCA CTA TCA CTT CCT GTT AGT CTA GGT      48 
Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly 
1               5                   10                  15 

GAT CAA GCC TCC ATC TCT TGC AGA TCT AGT CAG AGC CTT GTA CAC AGT      96 
Asp Gln Ala Ser Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser 
            20                  25                  30 

AA? GGA AAC ACC TAT TTA CGT TGG TAC CTG CAG AAG CCA GGC CAG TCT     144 
Asn Gly Asn Thr Tyr Leu Arg Trp Tyr Leu Gln Lys Pro Gly Gln Ser 
        35                  40                  45 

CCA AAG GTC CTG ATC TAC AAA GTT TCC AAC CGA TTT TCT GGG GTC CCA     192 
Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe Ser Gly Val Pro 
    50                  55                  60 

GAC AGG TTC AGT GGC AGT GGA TCA GGG ACA GAT TTC ACA CTC AAG ATC     240 
Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile 
65                  70                  75                  80 

AGC AGA GTG GAG GCT GAG GAT CTG GGA GTT TAT TTC TGC TCT CAA AGT     288 
Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser 
                85                  90                  95 

ACA CAT GTT CCG TGG ACG TTC GGT GGA GGC ACC AAG CTT GAA ATC AAA     336 
Thr His Val Pro Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys 
            100                 105                 110 

GGT TCT ACC TCT GGT TCT GGT AAA TCT TCT GAA GGT AAA GGT GAA GTT     384 
Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Glu Val 
        115                 120                 125 

AAA CTG GAT GAG ACT GGA GGA GGC TTG GTG CAA CCT GGG AGG CCC ATG     432 
Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly Arg Pro Met 
    130                 135                 140 

AAA CTC TCC TGT GTT GCC TCT GGA TTC ACT TTT AGT GAC TAC TGG ATG     480 
Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met 
145                 150                 155                 160 

AAC TGG GTC CGC CAG TCT CCA GAG AAA GGA CTG GAG TGG GTA GCA CAA     528 
Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln 
                165                 170                 175 

ATT AGA AAC AAA CCT TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT GTG     576 
Ile Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val 
            180                 185                 190 

AAA GGC AGA TTC ACC ATC TCA AGA GAT GAT TCC AAA AGT AGT GTC TAC     624 
Lys Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val Tyr 
        195                 200                 205 

CTC CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG GGT ATC TAT TAC TGT     672 
Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr Cys 
    210                 215                 220 

ACC GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCG GTC     720 
Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val 
225                 230                 235                 240 

ACC GTC TCC AGT GAT AAG ACC CAT ACA TGC TAA TAGGATCC                761 
Thr Val Ser Ser Asp Lys Thr His Thr Cys * 
                245                 250 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 251 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly
1 5 10 15

Asp Gln Ala Ser Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser
20 25 30

Asn Gly Asn Thr Tyr Leu Arg Trp Tyr Leu Gln Lys Pro Gly Gln Ser
35 40 45

Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe Ser Gly Val Pro
50 55 60

Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
65 70 75 80

Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser
85 90 95

Thr His Val Pro Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys
100 105 110

Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Glu Val
115 120 125

Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly Arg Pro Met
130 135 140

Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met
145 150 155 160

Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln
165 170 175

Ile Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val
180 185 190

Lys Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val Tyr
195 200 205

Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr Cys
210 215 220

Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val
225 230 235 240

Thr Val Ser Ser Asp Lys Thr His Thr Cys *
245 250

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 770 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : both 

(D) TOPOLOGY: both 
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..765



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

GAC GTC GTT ATG ACT CAG ACA CCA CTA TCA CTT CCT GTT AGT CTA GGT 48
Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly
1 5 10 15

GAT CAA GCC TCC ATC TCT TGC AGA TCT AGT CAG AGC CTT GTA CAC AGT 96
Asp Gln Ala Ser Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser
20 25 30

AAT GGA AAC ACC TAT TTA CGT TGG TAC CTG CAG AAG CCA GGC CAG TCT 144
Asn Gly Asn Thr Tyr Leu Arg Trp Tyr Leu Gln Lys Pro Gly Gln Ser
35 40 45

CCA AAG GTC CTG ATC TAC AAA GTT TCC AAC CGA TTT TCT GGG GTC CCA 192
Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe Ser Gly Val Pro
50 55 60

GAC AGG TTC AGT GGC AGT GGA TCA GGG ACA GAT TTC ACA CTG AAG ATC 240
Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
65 70 75 80

AGC AGA GTG GAG GCT GAG GAT CTG GGA GTT TAT TTC TGC TCT CAA AGT 288
Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser
85 90 95

ACA CAT GTT CCG TGG ACG TTC GGT GGA GGC ACC AAG CTT GAA ATC AAA 336
Thr His Val Pro Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys
100 105 110

GGT TCT ACC TCT GGT TCT GGT AAA TCT TCT GAA GGT AAA GGT GAA GTT 384
Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Glu Val
115 120 125

AAA CTG GAT GAG ACT GGA GGA GGC TTG GTG CAA CCT GGG AGG CCC ATG 432
Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly Arg Pro Met
130 135 140

AAA CTC TCC TGT GTT GCC TCT GGA TTC ACT TTT AGT GAC TAC TGG ATG 480
Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met
145 150 155 160

AAC TGG GTC CGC CAG TCT CCA GAG AAA GGA CTG GAG TGG GTA GCA CAA 528
Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln
165 170 175

ATT AGA AAC AAA CCT TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT GTG 576
Ile Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val
180 185 190

AAA GGC AGA TTC ACC ATC TCA AGA GAT GAT TCC AAA AGT AGT GTC TAC 624
Lys Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val Tyr
195 200 205

CTG CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG GGT ATC TAT TAC TGT 672
Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr Cys
210 215 220

ACG GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCG GTC 720
Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val
225 230 235 240

ACC GTC TCC AGT GAT AAG ACC CAT ACA TGC CCT CCA TGC TAA TAGGATCC 770
Thr Val Ser Ser Asp Lys Thr His Thr Cys Pro Pro Cys *
245 250
(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 254 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly
1 5 10 15

Asp Gln Ala Ser Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser
20 25 30

Asn Gly Asn Thr Tyr Leu Arg Trp Tyr Leu Gln Lys Pro Gly Gln Ser
35 40 45

Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe Ser Gly Val Pro
50 55 60

Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
65 70 75 80

Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser
85 90 95

Thr His Val Pro Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys
100 105 110

Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Glu Val
115 120 125

Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly Arg Pro Met
130 135 140

Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met
145 150 155 160

Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln
165 170 175

Ile Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val
180 185 190

Lys Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val Tyr
195 200 205

Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr Cys
210 215 220

Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val
225 230 235 240

Thr Val Ser Ser Asp Lys Thr His Thr Cys Pro Pro Cys *
245 250

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1460 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : both 

(D) TOPOLOGY: both 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1398 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC 48
Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly
1 5 10 15

GAG AAG GTT ACT TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA TAT AGT 96
Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser
20 25 30

GGT AAT CAA AAG AAC TAC TTG GCC TGG TAC CAG CAG AAA CCA GGG CAG 144
Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln
35 40 45

TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC GCT AGG GAA TCT GGG GTC 192
Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser Gly Val
50 55 60

CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC TCC 240
Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
65 70 75 80

ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG 288
Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln
85 90 95

TAT TAT AGC TAT CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT GTG CTG 336
Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
100 105 110

AAA GGC TCT ACT TCC GGT AGC GGC AAA TCC TCT GAA GGC AAA GGT CAG 384
Lys Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Gln
115 120 125

GTT CAG CTG CAG CAG TCT GAC GCT GAG TTG GTG AAA CCT GGG GCT TCA 432
Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala Ser
130 135 140

GTG AAG ATT TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC CAT GCA 480
Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala
145 150 155 160

ATT CAC TGG GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA 528
Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly
165 170 175

TAT TTT TCT CCC GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG 576
Tyr Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys
180 185 190

GGC AAG GCC ACA CTG ACT GCA GAC AAA TCC TCC AGC ACT GCC TAC GTG 624
Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val
195 200 205

CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT GCA GTG TAT TTC TGT ACA 672
Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr
210 215 220

AGA TCC CTG AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC ACC GTC 720
Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val
225 230 235 240

TCC TCA GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA 768
Ser Ser Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser
245 250 255

GTT GGC GAG AAG GTT ACT TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA 816
Val Gly Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu
260 265 270

TAT AGT GGT AAT CAA AAG AAC TAC TTG GCC TGG TAC CAG CAG AAA CCA 864
Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro
275 280 285

GGG CAG TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC GCT AGG GAA TCT 912
Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser
290 295 300
GGG GTC CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT 960
Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr
305 310 315 320

CTC TCC ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT 1008
Leu Ser Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys
325 330 335

CAG CAG TAT TAT AGC TAT CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT 1056
Gln Gln Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu
340 345 350

GTG CTG AAA GGC TCT ACT TCC GGT AGC GGC AAA TCC TCT GAA GGC AAA 1104
Val Leu Lys Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys
355 360 365

GGT CAG GTT CAG CTG CAG CAG TCT GAC GCT GAG TTG GTG AAA CCT GGG 1152
Gly Gln Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly
370 375 380

GCT TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC 1200
Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp
385 390 395 400

CAT GCA ATT CAC TGG GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG 1248
His Ala Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp
405 410 415

ATT GGA TAT TTT TCT CCC GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG 1296
Ile Gly Tyr Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg
420 425 430

TTC AAG GGC AAG GCC ACA CTG ACT GCA GAC AAA TCC TCC AGC ACT GCC 1344
Phe Lys Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala
435 440 445

TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT GCA GTG TAT TTC 1392
Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe
450 455 460

TGT ACA AGA TCC CTG AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC 1440
Cys Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val
465 470 475 480

ACC GTC TCC TAA TAG GAT CC 1460
Thr Val Ser * * Asp
485



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 486 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly
1 5 10 15

Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser
20 25 30

Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln
35 40 45

Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser Gly Val
50 55 60

Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
65 70 75 80
Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln
85 90 95

Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
100 105 110

Lys Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Gln
115 120 125

Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala Ser
130 135 140

Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala
145 150 155 160

Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly
165 170 175

Tyr Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys
180 185 190

Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val
195 200 205

Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr
210 215 220

Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val
225 230 235 240

Ser Ser Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser
245 250 255

Val Gly Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu
260 265 270

Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro
275 280 285

Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser
290 295 300

Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr
305 310 315 320

Leu Ser Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys
325 330 335

Gln Gln Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu
340 345 350

Val Leu Lys Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys
355 360 365

Gly Gln Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly
370 375 380

Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp
385 390 395 400

His Ala Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp
405 410 415

Ile Gly Tyr Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg
420 425 430

Phe Lys Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala
435 440 445

Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe
450 455 460

Cys Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val
465 470 475 480

Thr Val Ser * * Asp
485
(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 725 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..723

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

GAC GTC GTT ATG ACT CAG ACA CCA CTA TCA CTT CCT GTT AGT CTA GGT 48
Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly
1 5 10 15

GAT CAA GCC TCC ATC TCT TGC AGA TCT AGT CAG AGC CTT GTA CAC AGT 96
Asp Gln Ala Ser Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser
20 25 30

AAT GGA AAC ACC TAT TTA CGT TGG TAC CTG CAG AAG CCA GGC CAG TCT 144
Asn Gly Asn Thr Tyr Leu Arg Trp Tyr Leu Gln Lys Pro Gly Gln Ser
35 40 45

CCA AAG GTC CTG ATC TAC AAA GTT TCC AAC CGA TTT TCT GGG GTC CCA 192
Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe Ser Gly Val Pro
50 55 60

GAC AGG TTC AGT GGC AGT GGA TCA GGG ACA GAT TTC ACA CTC AAG ATC 240
Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
65 70 75 80

AGC AGA GTG GAG GCT GAG GAT CTG GGA GTT TAT TTC TGC TCT CAA AGT 288
Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser
85 90 95

ACA CAT GTT CCG TGG ACG TTC GGT GGA GGC ACC AAG CTT GAA ATC AAA 336
Thr His Val Pro Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys
100 105 110

GGT TCT ACC TCT GGT AAA CCA TCT GAA GGC AAA GGT CAG GTT CAG CTG 384
Gly Ser Thr Ser Gly Lys Pro Ser Glu Gly Lys Gly Gln Val Gln Leu
115 120 125

CAG CAG TCT GAC GCT GAG TTG GTG AAA CCT GGG GCT TCA GTG AAG ATT 432
Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala Ser Val Lys Ile
130 135 140

TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC CAT GCA ATT CAC TGG 480
Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala Ile His Trp
145 150 155 160

GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT TCT 528
Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser
165 170 175

CCC GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG GGC AAG GCC 576
Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala
180 185 190

ACA CTG ACT GCA GAC AAA TCC TCC AGC ACT GCC TAC GTG CAG CTC AAC 624
Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn
195 200 205

AGC CTG ACA TCT GAG GAT TCT GCA GTG TAT TTC TGT ACA AGA TCC CTG 672
Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr Arg Ser Leu
210 215 220
AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC ACC GTC TCC TAA TAG 720
Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val Ser * *
225 230 235 240

GAT CC 725
Asp



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 241 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly
1 5 10 15

Asp Gln Ala Ser Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser
20 25 30

Asn Gly Asn Thr Tyr Leu Arg Trp Tyr Leu Gln Lys Pro Gly Gln Ser
35 40 45

Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe Ser Gly Val Pro
50 55 60

Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
65 70 75 80

Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser
85 90 95

Thr His Val Pro Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys
100 105 110

Gly Ser Thr Ser Gly Lys Pro Ser Glu Gly Lys Gly Gln Val Gln Leu
115 120 125

Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala Ser Val Lys Ile
130 135 140

Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala Ile His Trp
145 150 155 160

Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser
165 170 175

Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala
180 185 190

Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn
195 200 205

Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr Arg Ser Leu
210 215 220

Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val Ser * *
225 230 235 240

Asp



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 738 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..738

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC 48
Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly
1 5 10 15

GAG AAG GTT ACT TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA TAT AGT 96
Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser
20 25 30

GGT AAT CAA AAG AAC TAC TTG GCC TGG TAC CAG CAG AAA CCA GGG CAG 144
Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln
35 40 45

TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC GCT AGG GAA TCT GGG GTC 192
Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser Gly Val
50 55 60

CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC TCC 240
Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
65 70 75 80

ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG 288
Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln
85 90 95

TAT TAT AGC TAT CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT GTG CTG 336
Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
100 105 110

AAA GGC TCT ACT TCC GGT AAA CCA TCT GAA GGT AAA GGT GAA GTT AAA 384
Lys Gly Ser Thr Ser Gly Lys Pro Ser Glu Gly Lys Gly Glu Val Lys
115 120 125

CTC GAT GAG ACT GGA GGA GGC TTG GTG CAA CCT GGG AGG CCC ATG AAA 432
Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly Arg Pro Met Lys
130 135 140

CTC TCC TGT GTT GCC TCT GGA TTC ACT TTT AGT GAC TAC TGG ATG AAC 480
Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met Asn
145 150 155 160

TGG GTC CGC CAG TCT CCA GAG AAA GGA CTG GAG TGG GTA GCA CAA ATT 528
Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln Ile
165 170 175

AGA AAC AAA CCT TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT GTG AAA 576
Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val Lys
180 185 190

GGC AGA TTC ACC ATC TCA AGA GAT GAT TCC AAA AGT AGT GTC TAC CTG 624
Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val Tyr Leu
195 200 205

CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG GGT ATC TAT TAC TGT ACG 672
Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr Cys Thr
210 215 220

GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCA GTC ACC 720
Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val Thr
225 230 235 240

GTC TCC TAA TAA GGA TCC 738
Val Ser * * Gly Ser
245



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 246 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

WO 93/11161 PCT/US92/09965

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly
1 5 10 15

Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser
20 25 30

Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln
35 40 45

Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser Gly Val
50 55 60

Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
65 70 75 80

Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln
85 90 95

Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
100 105 110

Lys Gly Ser Thr Ser Gly Lys Pro Ser Glu Gly Lys Gly Glu Val Lys
115 120 125

Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly Arg Pro Met Lys
130 135 140

Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met Asn
145 150 155 160

Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln Ile
165 170 175

Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val Lys
180 185 190

Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val Tyr Leu
195 200 205

Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr Cys Thr
210 215 220

Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val Thr
225 230 235 240

Val Ser * * Gly Ser
245
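
As an illustrative aside that is not part of the original disclosure, the pairing of each nucleotide triplet with a three-letter residue in the listings above can be checked mechanically. The Python sketch below assumes the standard genetic code; the helper names `translate` and `mismatches` are hypothetical and introduced only for illustration. It translates one row of a listing and reports any position where the printed residue disagrees with its codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# One-letter to three-letter residue abbreviations.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}

CODON_TABLE = {
    "".join(codon): THREE_LETTER[aa]
    for codon, aa in zip(product(BASES, repeat=3), ONE_LETTER)
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into three-letter residues.

    Trailing bases that do not fill a whole codon are dropped.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return [CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3)]


def mismatches(dna, listed):
    """Return (position, translated, listed) for each disagreeing residue."""
    residues = listed.split()
    return [
        (pos, got, want)
        for pos, (got, want) in enumerate(zip(translate(dna), residues), start=1)
        if got != want
    ]


# First row of SEQ ID NO: 18 as an example input.
row = "GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC"
printed = "Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly"
print(mismatches(row, printed))  # → []
```

A non-empty result flags a row in which either the codon or the printed residue was mistranscribed; the sketch cannot tell which of the two is at fault.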
What Is Claimed Is:

1. A multivalent antigen-binding protein comprising two or more 
single-chain molecules, each single-chain molecule comprising: 

(a) a first polypeptide comprising the binding portion of the 
variable region of an antibody heavy or light chain; 

(b) a second polypeptide comprising the binding portion of 
the variable region of an antibody heavy or light chain; and 

(c) a peptide linker linking said first and second polypeptides 
(a) and (b) into said single-chain molecule. 

2. The multivalent protein of claim 1 wherein said first polypeptide 
comprises the binding portion of the variable region of an antibody light chain, 
and said second polypeptide comprises the binding portion of the variable 
region of an antibody heavy chain. 

3. The multivalent protein of claim 1 wherein said first polypeptide
comprises the binding portion of the variable region of an antibody light chain,
and said second polypeptide comprises the binding portion of the variable
region of an antibody light chain.

4. The multivalent protein of claim 1 wherein said first polypeptide 
comprises the binding portion of the variable region of an antibody heavy 
chain, and said second polypeptide comprises the binding portion of the 
variable region of an antibody heavy chain. 

5. The multivalent protein of claims 1, 2, 3, or 4 comprising a 
bivalent antigen-binding protein. 

6. The multivalent protein of claim 5 comprising a heterobivalent 
antigen-binding protein. 
7. The multivalent protein of claim 5 comprising a homobivalent
antigen-binding protein.

8. A composition comprising a multivalent antigen-binding protein
substantially free of single-chain molecules, wherein said multivalent protein
comprises two or more single-chain molecules, each single-chain molecule
comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain molecule.

9. The composition of claim 8 wherein said first polypeptide
comprises the binding portion of the variable region of an antibody light chain,
and said second polypeptide comprises the binding portion of the variable
region of an antibody heavy chain.

10. The composition of claim 8 wherein said first polypeptide
comprises the binding portion of the variable region of an antibody light chain,
and said second polypeptide comprises the binding portion of the variable
region of an antibody light chain.

11. The composition of claim 8 wherein said first polypeptide
comprises the binding portion of the variable region of an antibody heavy
chain, and said second polypeptide comprises the binding portion of the
variable region of an antibody heavy chain.

12. The composition of claims 8, 9, 10, or 11, comprising a
bivalent antigen-binding protein substantially free of single-chain molecules.
13. The composition of claim 12 wherein said bivalent protein is
heterobivalent.

14. The composition of claim 12 wherein said bivalent protein is 
homobivalent. 

15. An aqueous composition comprising an excess of multivalent
antigen-binding protein over single-chain molecules, said multivalent protein
comprising two or more single-chain molecules, each single-chain molecule
comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

16. The aqueous composition of claim 15 wherein at least one of
said single-chain molecules comprises:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

17. The aqueous composition of claim 15 wherein at least one of
said single-chain molecules comprises:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody light chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

18. The composition of claim 15 wherein at least one of said single-chain
molecules comprises:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

19. A method of producing a multivalent antigen-binding protein,
comprising the steps of:

(a) producing a composition comprising multivalent antigen-binding
protein and single-chain molecules, each single-chain molecule
comprising:

(i) a first polypeptide comprising the binding portion
of the variable region of an antibody heavy or light chain;

(ii) a second polypeptide comprising the binding
portion of the variable region of an antibody heavy or light chain; and

(iii) a peptide linker linking said first and second
polypeptides (a) and (b) into said single-chain molecule;

(b) separating said multivalent protein from said single-chain
molecules; and

(c) recovering said multivalent protein.

20. The method of claim 19 wherein separating said multivalent
protein from said single-chain molecules comprises utilizing cation exchange
chromatography.
21. The method of claim 19 wherein separating said multivalent
protein from said single-chain molecules comprises utilizing gel filtration
chromatography.

22. A method of producing a multivalent antigen-binding protein 
comprising the steps of: 

(a) producing a composition comprising single-chain 
molecules, each single-chain molecule comprising: 

(i) a first polypeptide comprising the binding portion 
of the variable region of an antibody heavy or light chain; 

(ii) a second polypeptide comprising the binding 
portion of the variable region of an antibody heavy or light chain; and 

(iii) a peptide linker linking said first and second 
polypeptides (a) and (b) into said single-chain molecule; 

(b) dissociating said single-chain molecules; 

(c) re-associating said single-chain molecules; 

(d) separating multivalent antigen-binding proteins from said 
single-chain molecules; and 

(e) recovering said multivalent proteins. 



23. The method of claim 22 wherein said dissociation is caused by 
dialysis against a dissociating solution. 



24. The method of claim 22 wherein said reassociation is caused by 
dialysis against a refolding solution or a refolding agent. 



25. A method of producing a multivalent antigen-binding protein, 
comprising the step of cross-linking at least two single-chain molecules to each 
other, each single-chain molecule comprising: 

(a) a first polypeptide comprising the binding portion of the 
variable region of an antibody heavy or light chain; 
(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain molecule.

26. The method of claim 25 wherein said cross-linking is effected
by chemical means.

27. A method of producing a multivalent antigen-binding protein,
comprising the steps of:

(a) producing a composition comprising single-chain
molecules, each single-chain molecule comprising:

(i) a first polypeptide comprising the binding portion
of the variable region of an antibody heavy or light chain;

(ii) a second polypeptide comprising the binding
portion of the variable region of an antibody heavy or light chain; and

(iii) a peptide linker linking said first and second
polypeptides (a) and (b) into said single-chain molecule;

(b) concentrating said single-chain molecules;

(c) separating said multivalent protein from said single-chain
molecules; and

(d) recovering said multivalent protein.

28. The method of claim 27 wherein said concentrating step occurs
from approximately 0.5 mg/ml single-chain molecule to the concentration at
which precipitation starts.

29. A metiiod of detecting an antigen in or suspected of being in a 

25 sample, which comprises: 

(a) contacting said sample witii Hie multivalent antigen- 
binding protein of claim 1; and 



15 
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(b) detecting whether said multivalent antigen-binding • 
protein has bound to said antigen. 



30. A method of imaging the internal structure of an animal, 
comprising administering to said animal an effective amount of a labeled form 
of the multivalent antigen-binding protein of claim 1 and measuring detectable
radiation associated with said animal.



31. A composition comprising an association of a multivalent 
antigen-binding protein as claimed in any one of claims 1-4, 8-11, or 15-18
with a therapeutically or diagnostically effective agent. 



32. A single-chain protein comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody light chain;

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

33. A single-chain protein comprising: 

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy chain;

(c) a peptide linker linking said first and second polypeptides 
(a) and (b) into said single-chain protein. 



34. A single-chain protein comprising: 
(a) a first polypeptide comprising the VL or VH of a CC49
monoclonal antibody;

(b) a second polypeptide comprising the VL or VH of a CC49
monoclonal antibody; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

35. The single-chain protein of claim 34 wherein said linker is 
selected from the group consisting of the 202', 212, 216, and 217 linkers.

36. A single-chain protein comprising:

(a) a first polypeptide comprising the VL or VH of a CC49
monoclonal antibody;

(b) a second polypeptide comprising the VL or VH of a 4-4-20
monoclonal antibody; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

37. The single-chain protein of claim 36 wherein said linker is
selected from the group consisting of the 202', 212, 216, and 217 linkers.

38. A genetic sequence which codes for the single-chain protein of
claim 32, comprising:

(a) a DNA sequence coding for a first polypeptide
comprising the binding portion of the variable region of an antibody light
chain;

(b) a DNA sequence coding for a second polypeptide
comprising the binding portion of the variable region of an antibody light
chain;

(c) a DNA sequence coding for a peptide linker linking said 
first and second polypeptides (a) and (b) into said single-chain protein. 



39. A genetic sequence which codes for the single-chain protein of
claim 33, comprising:

(a) a DNA sequence coding for a first polypeptide
comprising the binding portion of the variable region of an antibody heavy 
chain; 

(b) a DNA sequence coding for a second polypeptide 
comprising the binding portion of the variable region of an antibody heavy 
chain; 

(c) a DNA sequence coding for a peptide linker linking said 
first and second polypeptides (a) and (b) into said single-chain protein. 

40. A genetic sequence which codes for the single-chain protein of 
claim 34, comprising: 

(a) a DNA sequence coding for the VL or VH of a CC49
monoclonal antibody;

(b) a DNA sequence coding for the VL or VH of a CC49
monoclonal antibody;

(c) a DNA sequence coding for a peptide linker linking said 
first and second polypeptides (a) and (b) into said single-chain protein. 

41. The genetic sequence of claim 40 wherein said DNA sequence 
codes for a peptide linker selected from the group consisting of the 202', 212,
216, and 217 linkers. 

42. A genetic sequence which codes for the single-chain protein of 
claim 36, comprising: 

(a) a DNA sequence coding for the VL or VH of a CC49
monoclonal antibody;

(b) a DNA sequence coding for the VL or VH of a 4-4-20
monoclonal antibody;

(c) a DNA sequence coding for a peptide linker linking said
first and second polypeptides (a) and (b) into said single-chain protein.

43. The genetic sequence of claim 42 wherein said DNA sequence
codes for a peptide linker selected from the group consisting of the 202', 212, 
216, and 217 linkers. 

44. A multivalent single-chain antigen-binding protein comprising: 
(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain;

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said multivalent protein;

(d) a third polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(e) a fourth polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain;

(f) a peptide linker linking said third and fourth polypeptides
(d) and (e) into said multivalent protein; and

(g) a peptide linker linking said second and third 
polypeptides (b) and (d) into said multivalent protein. 

45. A multivalent single-chain antigen-binding protein comprising: 
(a) a first polypeptide comprising the binding portion of the
variable region of an antibody light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy chain;

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said multivalent protein;

(d) a third polypeptide comprising the binding portion of the
variable region of an antibody light chain;

(e) a fourth polypeptide comprising the binding portion of
the variable region of an antibody heavy chain;

(f) a peptide linker linking said third and fourth polypeptides
(d) and (e) into said multivalent protein; and

(g) a peptide linker linking said second and third
polypeptides (b) and (d) into said multivalent protein.

46. A genetic sequence which codes for the multivalent antigen-
binding protein of claim 44 or 45, comprising:

(a) a DNA sequence coding for a first polypeptide
comprising the binding portion of the variable region of an antibody heavy or
light chain;

(b) a DNA sequence coding for a second polypeptide
comprising the binding portion of the variable region of an antibody heavy or
light chain;

(c) a DNA sequence coding for a peptide linker linking said
first and second polypeptides (a) and (b) into said multivalent protein;

(d) a DNA sequence coding for a third polypeptide
comprising the binding portion of the variable region of an antibody heavy or
light chain;

(e) a DNA sequence coding for a fourth polypeptide
comprising the binding portion of the variable region of an antibody heavy or
light chain;

(f) a DNA sequence coding for a peptide linker linking said 
third and fourth polypeptides (d) and (e) into said multivalent protein; and 

(g) a DNA sequence coding for a peptide linker linking said 
second and third polypeptides (b) and (d) into said multivalent protein. 

47. A replicable cloning or expression vehicle comprising the DNA
sequence of any one of claims 38-43.

48. The vehicle of claim 47 which is a plasmid. 

49. A host cell transformed with the vehicle of claim 47. 



50. The host cell of claim 49 which is a bacterial cell, a yeast cell
or other fungal cell, or a mammalian cell line. 

51. A method of producing a multivalent antigen-binding protein 
comprising two or more single-chain molecules, each single-chain molecule 
comprising: 

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain molecule, said method comprising:

(i) providing a genetic sequence coding for said
single-chain molecule;

(ii) transforming one or more host cells with said
sequence;

(iii) expressing said sequence in said host or hosts;
and

(iv) recovering a multivalent protein from said host
or hosts.



52. A method of producing a multivalent single-chain antigen-
binding protein comprising two or more single-chain molecules, each single-
chain molecule comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain;

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said multivalent protein;

(d) a third polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(e) a fourth polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain;

(f) a peptide linker linking said third and fourth polypeptides
(d) and (e) into said multivalent protein; and

(g) a peptide linker linking said second and third
polypeptides (b) and (d) into said multivalent protein, said method comprising:

(i) providing a genetic sequence coding for said
single-chain molecule;

(ii) transforming one or more host cells with said
sequence;

(iii) expressing said sequence in said host or hosts;
and

(iv) recovering a multivalent protein from said host
or hosts.

53. The method of claim 51 or 52 wherein recovering said
multivalent protein comprises separating said multivalent protein from said
single-chain molecules.

54. The method of claim 51 or 52 wherein recovering said 
multivalent protein comprises:

(a) dissociating said single-chain molecules;

(b) re-associating said single-chain molecules;

(c) separating multivalent antigen-binding proteins from said
single-chain molecules; and

(d) recovering said multivalent proteins. 

55. The method of claim 51 or 52 which further comprises
purifying said recovered multivalent protein.

56. The method of claim 51 or 52 wherein said host cell is a 
bacterial cell, a yeast cell or other fungal cell, or a mammalian cell line.

57. The method of claim 56 wherein said host cell is E. coli or
Bacillus subtilis. 

58. The multivalent antigen-binding protein of claim 1 in detectably-
labelled form.



59. In an immunoassay method which utilizes an antibody in 
detectably-labelled form, the improvement comprising using the multivalent
protein of claim 58 instead of said antibody. 

60. The immunoassay of claim 59 wherein said immunoassay is a 
competitive immunoassay. 

61. The immunoassay of claim 59 wherein said immunoassay is a 
sandwich immunoassay.

62. In an immunotherapeutic method which utilizes an antibody 
conjugated to a therapeutic agent, the improvement comprising using the 
multivalent protein of claim 1 instead of said antibody. 

63. In a method of immunoaffinity purification which utilizes an 
antibody therefor, the improvement which comprises using the molecule of 
claim 1 instead of said antibody.

4-4-20 VL/212/CC49 VH gene

4-4-20 VL 10 20

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly Asp Gln Ala Ser
GAC GTC GTT ATG ACT CAG ACA CCA CTA TCA CTT CCT GTT AGT CTA GGT GAT CAA GCC TCC
Aat II

Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser Asn Gly Asn Thr Tyr Leu Arg Trp
ATC TCT TGC AGA TCT AGT CAG AGC CTT GTA CAC AGT AAT GGA AAC ACC TAT TTA CGT TGG

50 60
Tyr Leu Gln Lys Pro Gly Gln Ser Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe
TAC CTG CAG AAG CCA GGC CAG TCT CCA AAG GTC CTG ATC TAC AAA GTT TCC AAC CGA TTT

70 80
Ser Gly Val Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
TCT GGG GTC CCA GAC AGG TTC AGT GGC AGT GGA TCA GGG ACA GAT TTC ACA CTC AAG ATC

90
Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser Thr His Val Pro
AGC AGA GTG GAG GCT GAG GAT CTG GGA GTT TAT TTC TGC TCT CAA AGT ACA CAT GTT CCG

110 212 Linker 120
Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys Gly Ser Thr Ser Gly Ser Gly Lys
TGG ACG TTC GGT GGA GGC ACC AAG CTT GAA ATC AAA GGT TCT ACC TCT GGT TCT GGT AAA
Hind III

CC49 VH 130 140
Ser Ser Glu Gly Lys Gly Gln Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro
TCC TCT GAA GGC AAA GGT CAG GTT CAG CTG CAG CAG TCT GAC GCT GAG TTG GTG AAA CCT
Pvu II Pst I

150 160
Gly Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala Ile
GGG GCT TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC CAT GCA ATT

170 180
His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser Pro Gly
CAC TGG GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT TCT CCC GGA

FIG. 10A

4-4-20 VL/212/CC49 VH gene

190 200
Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala Thr Leu Thr Ala Asp Lys
AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG GGC AAG GCC ACA CTG ACT GCA GAC AAA

210 220
Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr
TCC TCC AGC ACT GCC TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT GCA GTG TAT

230 240
Phe Cys Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val Ser
TTC TGT ACA AGA TCC CTG AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC ACC GTC TCC

*** *** Asp
TAA TAG GAT CC
Bam HI
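
As a cross-check on the figure, the codons lying between the end of the 4-4-20 VL domain (... Glu Ile Lys) and the start of the CC49 VH domain (Gln Val Gln ...) can be translated with the standard genetic code. The following Python sketch is illustrative only: the names `translate` and `LINKER_212_DNA` are hypothetical, the DNA is read from the figure with OCR-damaged characters restored, and treating exactly these 14 codons as the 212 linker is an assumption based on where the figure places its "212 Linker" label.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acids."""
    seq = dna.replace(" ", "").upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Assumed 212 linker codons (residues 113-126 of the FIG. 10A construct).
LINKER_212_DNA = "GGT TCT ACC TCT GGT TCT GGT AAA TCC TCT GAA GGC AAA GGT"

if __name__ == "__main__":
    print(translate(LINKER_212_DNA))  # GSTSGSGKSSEGKG
```

In three-letter form the result reads Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly, which agrees with the amino acid line printed above the DNA in the figure.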
CC49 VL/212/4-4-20 VH gene
CC49 VL

Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly Glu Lys Val Thr
GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC GAG AAG GTT ACT
Aat II

30 40
Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala
TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA TAT AGT GGT AAT CAA AAG AAC TAC TTG GCC

50 60
Trp Tyr Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg
TGG TAC CAG CAG AAA CCA GGG CAG TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC GCT AGG

70 80
Glu Ser Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
GAA TCT GGG GTC CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC TCC

90 100
Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr Ser Tyr
ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG TAT TAT AGC TAT

110 212 Linker 120
Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu Lys Gly Ser Thr Ser Gly Ser Gly
CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT GTG CTG AAA GGC TCT ACT TCC GGT AGC GGC
Hind III

4-4-20 VH
Lys Ser Ser Glu Gly Lys Gly Glu Val Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln
AAA TCT TCT GAA GGT AAA GGT GAA GTT AAA CTG GAT GAG ACT GGA GGA GGC TTG GTG CAA

150 160
Pro Gly Arg Pro Met Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp
CCT GGG AGG CCC ATG AAA CTC TCC TGT GTT GCC TCT GGA TTC ACT TTT AGT GAC TAC TGG

170 180
Met Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln Ile Arg Asn
ATG AAC TGG GTC CGC CAG TCT CCA GAG AAA GGA CTG GAG TGG GTA GCA CAA ATT AGA AAC

FIG. 10B

CC49 VL/212/4-4-20 VH gene

190 200
Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val Lys Gly Arg Phe Thr Ile Ser
AAA CCT TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT GTG AAA GGC AGA TTC ACC ATC TCA

210 220
Arg Asp Asp Ser Lys Ser Ser Val Tyr Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met
AGA GAT GAT TCC AAA AGT AGT GTC TAC CTG CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG

230 240
Gly Ile Tyr Tyr Cys Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser
GGT ATC TAT TAC TGT ACG GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCA

Val Thr Val Ser *** *** Gly Ser
GTC ACC GTC TCC TAA TAA GGA TCC
Bam HI

4-4-20/212 protein with single cysteine hinge

4-4-20 VL

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly Asp Gln Ala Ser
GAC GTC GTT ATG ACT CAG ACA CCA CTA TCA CTT CCT GTT AGT CTA GGT GAT CAA GCC TCC
Aat II

40
Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser Asn Gly Asn Thr Tyr Leu Arg Trp
ATC TCT TGC AGA TCT AGT CAG AGC CTT GTA CAC AGT AAT GGA AAC ACC TAT TTA CGT TGG

50 60
Tyr Leu Gln Lys Pro Gly Gln Ser Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe
TAC CTG CAG AAG CCA GGC CAG TCT CCA AAG GTC CTG ATC TAC AAA GTT TCC AAC CGA TTT

70 80
Ser Gly Val Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
TCT GGG GTC CCA GAC AGG TTC AGT GGC AGT GGA TCA GGG ACA GAT TTC ACA CTC AAG ATC

90 100
Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser Thr His Val Pro
AGC AGA GTG GAG GCT GAG GAT CTG GGA GTT TAT TTC TGC TCT CAA AGT ACA CAT GTT CCG

110 212 Linker 120
Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys Gly Ser Thr Ser Gly Ser Gly Lys
TGG ACG TTC GGT GGA GGC ACC AAG CTT GAA ATC AAA GGT TCT ACC TCT GGT TCT GGT AAA
Hind III

4-4-20 VH
Ser Ser Glu Gly Lys Gly Glu Val Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro
TCT TCT GAA GGT AAA GGT GAA GTT AAA CTG GAT GAG ACT GGA GGA GGC TTG GTG CAA CCT

150
Gly Arg Pro Met Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met
GGG AGG CCC ATG AAA CTC TCC TGT GTT GCC TCT GGA TTC ACT TTT AGT GAC TAC TGG ATG

170 180
Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln Ile Arg Asn Lys
AAC TGG GTC CGC CAG TCT CCA GAG AAA GGA CTG GAG TGG GTA GCA CAA ATT AGA AAC AAA

FIG. 15A

4-4-20/212 protein with single cysteine hinge

190 200
Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg
CCT TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT GTG AAA GGC AGA TTC ACC ATC TCA AGA

210 220
Asp Asp Ser Lys Ser Ser Val Tyr Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly
GAT GAT TCC AAA AGT AGT GTC TAC CTG CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG GGT

230 240
Ile Tyr Tyr Cys Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val
ATC TAT TAC TGT ACG GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCG GTC
Bst EII

Hinge 250
Thr Val Ser Ser Asp Lys Thr His Thr Cys *** ***
ACC GTC TCC AGT GAT AAG ACC CAT ACA TGC TAA TAG GAT CC
Bam HI

pGx 5532, Gx 8932

FIG. 15A (CONT.)

4-4-20/212 protein with two cysteine hinge
4-4-20 VL

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly Asp Gln Ala Ser
GAC GTC GTT ATG ACT CAG ACA CCA CTA TCA CTT CCT GTT AGT CTA GGT GAT CAA GCC TCC
Aat II

40
Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser Asn Gly Asn Thr Tyr Leu Arg Trp
ATC TCT TGC AGA TCT AGT CAG AGC CTT GTA CAC AGT AAT GGA AAC ACC TAT TTA CGT TGG

50 60
Tyr Leu Gln Lys Pro Gly Gln Ser Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe
TAC CTG CAG AAG CCA GGC CAG TCT CCA AAG GTC CTG ATC TAC AAA GTT TCC AAC CGA TTT

70 80
Ser Gly Val Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
TCT GGG GTC CCA GAC AGG TTC AGT GGC AGT GGA TCA GGG ACA GAT TTC ACA CTC AAG ATC

90
Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser Thr His Val Pro
AGC AGA GTG GAG GCT GAG GAT CTG GGA GTT TAT TTC TGC TCT CAA AGT ACA CAT GTT CCG

110 212 Linker 120
Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys Gly Ser Thr Ser Gly Ser Gly Lys
TGG ACG TTC GGT GGA GGC ACC AAG CTT GAA ATC AAA GGT TCT ACC TCT GGT TCT GGT AAA
Hind III

4-4-20 VH 130 140
Ser Ser Glu Gly Lys Gly Glu Val Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro
TCT TCT GAA GGT AAA GGT GAA GTT AAA CTG GAT GAG ACT GGA GGA GGC TTG GTG CAA CCT

150
Gly Arg Pro Met Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met
GGG AGG CCC ATG AAA CTC TCC TGT GTT GCC TCT GGA TTC ACT TTT AGT GAC TAC TGG ATG

170 180
Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln Ile Arg Asn Lys
AAC TGG GTC CGC CAG TCT CCA GAG AAA GGA CTG GAG TGG GTA GCA CAA ATT AGA AAC AAA

FIG. 15B

4-4-20/212 protein with two cysteine hinge

190 200
Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg
CCT TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT GTG AAA GGC AGA TTC ACC ATC TCA AGA

210 220
Asp Asp Ser Lys Ser Ser Val Tyr Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly
GAT GAT TCC AAA AGT AGT GTC TAC CTG CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG GGT

230 240
Ile Tyr Tyr Cys Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val
ATC TAT TAC TGT ACG GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCG GTC
Bst EII

Hinge 250
Thr Val Ser Ser Asp Lys Thr His Thr Cys Pro Pro Cys *** ***
ACC GTC TCC AGT GAT AAG ACC CAT ACA TGC CCT CCA TGC TAA TAG GAT CC
Bam HI

pGx 5533, Gx 8933

FIG. 15B (CONT.)
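
The restriction sites annotated under the hinge region (Bst EII, Bam HI) can be located by simple pattern matching. The following is a minimal Python sketch, assuming the commonly cited recognition sequences for the enzymes named in the figures (GACGTC for AatII, AAGCTT for HindIII, GGTNACC for BstEII, GGATCC for BamHI). The function name `find_sites`, the constant names, and the 0-based offsets it reports are illustrative and are not part of the figure; the DNA is read from the figure with OCR-damaged characters restored.

```python
import re

# Recognition sequences (5'->3') for the enzymes named in the figures.
# "N" stands for any base. The figures give only the enzyme names, so
# these site definitions are an assumption of this sketch.
RECOGNITION_SITES = {
    "AatII": "GACGTC",
    "HindIII": "AAGCTT",
    "BstEII": "GGTNACC",
    "BamHI": "GGATCC",
}


def find_sites(dna, sites=RECOGNITION_SITES):
    """Return {enzyme: [0-based start offsets]} for each recognition site."""
    seq = dna.replace(" ", "").upper()
    hits = {}
    for enzyme, site in sites.items():
        # A lookahead lets overlapping occurrences be reported as well.
        pattern = re.compile("(?=" + site.replace("N", "[ACGT]") + ")")
        hits[enzyme] = [m.start() for m in pattern.finditer(seq)]
    return hits


# 3' end of the two-cysteine hinge construct (last three codons of the
# 221-240 row followed by the hinge row).
HINGE_TAIL = (
    "ACC TCG GTC ACC GTC TCC AGT GAT AAG ACC CAT ACA TGC "
    "CCT CCA TGC TAA TAG GAT CC"
)

if __name__ == "__main__":
    for enzyme, offsets in find_sites(HINGE_TAIL).items():
        print(enzyme, offsets)
```

On this fragment the sketch finds one BstEII site spanning the TCG GTC ACC codons and one BamHI site overlapping the second stop codon, and no AatII or HindIII site, which is consistent with the labels placed under the sequence.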
CC49/212 SCA™ protein genetic dimer
CC49 VL

Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly Glu Lys Val Thr
GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC GAG AAG GTT ACT
Aat II

Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala
TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA TAT AGT GGT AAT CAA AAG AAC TAC TTG GCC

50 60
Trp Tyr Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg
TGG TAC CAG CAG AAA CCA GGG CAG TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC GCT AGG

70 80
Glu Ser Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
GAA TCT GGG GTC CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC TCC

90 100
Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr Ser Tyr
ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG TAT TAT AGC TAT

110 212 Linker 120
Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu Lys Gly Ser Thr Ser Gly Ser Gly
CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT GTG CTG AAA GGC TCT ACT TCC GGT AGC GGC
Hind III

CC49 VH 140
Lys Ser Ser Glu Gly Lys Gly Gln Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys
AAA TCC TCT GAA GGC AAA GGT CAG GTT CAG CTG CAG CAG TCT GAC GCT GAG TTG GTG AAA
Pvu II Pst I

150 160
Pro Gly Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala
CCT GGG GCT TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC CAT GCA

170 180
Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser Pro
ATT CAC TGG GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT TCT CCC

FIG. 16A

CC49/212 SCA™ protein genetic dimer

190 200
Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala Thr Leu Thr Ala Asp
GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG GGC AAG GCC ACA CTG ACT GCA GAC

210 220
Lys Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val
AAA TCC TCC AGC ACT GCC TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT GCA GTG

230 240
Tyr Phe Cys Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val
TAT TTC TGT ACA AGA TCC CTG AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC ACC GTC

CC49 VL 250 260
Ser Ser Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly Glu Lys
TCC TCA GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC GAG AAG
Aat II

270 280
Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr
GTT ACT TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA TAT AGT GGT AAT CAA AAG AAC TAC

290 300
Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser
TTG GCC TGG TAC CAG CAG AAA CCA GGG CAG TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC

310 320
Ala Arg Glu Ser Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr
GCT AGG GAA TCT GGG GTC CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT

330 340
Leu Ser Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr
CTC TCC ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG TAT TAT

350 212 Linker 360
Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu Lys Gly Ser Thr Ser Gly
AGC TAT CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT GTG CTG AAA GGC TCT ACT TCC GGT
Hind III

CC49 VH 380
Ser Gly Lys Ser Ser Glu Gly Lys Gly Gln Val Gln Leu Gln Gln Ser Asp Ala Glu Leu
AGC GGC AAA TCC TCT GAA GGC AAA GGT CAG GTT CAG CTG CAG CAG TCT GAC GCT GAG TTG
Pvu II Pst I

CC49/212 SCA™ protein genetic dimer

390 400
Val Lys Pro Gly Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp
GTG AAA CCT GGG GCT TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC

410 420
His Ala Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe
CAT GCA ATT CAC TGG GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT

430 440
Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala Thr Leu Thr
TCT CCC GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG GGC AAG GCC ACA CTG ACT

450 460
Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser
GCA GAC AAA TCC TCC AGC ACT GCC TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT

470 480
Ala Val Tyr Phe Cys Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val
GCA GTG TAT TTC TGT ACA AGA TCC CTG AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC

Thr Val Ser *** *** Asp
ACC GTC TCC TAA TAG GAT CC
Bam HI

FIG. 16C

4-4-20 VL/217/CC49 VH gene

4-4-20 VL 10 20

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly Asp Gln Ala Ser
GAC GTC GTT ATG ACT CAG ACA CCA CTA TCA CTT CCT GTT AGT CTA GGT GAT CAA GCC TCC
Aat II

30 40
Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser Asn Gly Asn Thr Tyr Leu Arg Trp
ATC TCT TGC AGA TCT AGT CAG AGC CTT GTA CAC AGT AAT GGA AAC ACC TAT TTA CGT TGG

50 60
Tyr Leu Gln Lys Pro Gly Gln Ser Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe
TAC CTG CAG AAG CCA GGC CAG TCT CCA AAG GTC CTG ATC TAC AAA GTT TCC AAC CGA TTT

70 80
Ser Gly Val Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
TCT GGG GTC CCA GAC AGG TTC AGT GGC AGT GGA TCA GGG ACA GAT TTC ACA CTC AAG ATC

90 100
Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser Thr His Val Pro
AGC AGA GTG GAG GCT GAG GAT CTG GGA GTT TAT TTC TGC TCT CAA AGT ACA CAT GTT CCG

110 217 Linker 120
Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys Gly Ser Thr Ser Gly Lys Pro Ser
TGG ACG TTC GGT GGA GGC ACC AAG CTT GAA ATC AAA GGT TCT ACC TCT GGT AAA CCA TCT
Hind III

CC49 VH 130 140
Glu Gly Lys Gly Gln Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala
GAA GGC AAA GGT CAG GTT CAG CTG CAG CAG TCT GAC GCT GAG TTG GTG AAA CCT GGG GCT
Pvu II Pst I

150 160
Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala Ile His Trp
TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC CAT GCA ATT CAC TGG

170 180
Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser Pro Gly Asn Asp
GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT TCT CCC GGA AAT GAT

FIG. 19A

4-4-20 VL/217/CC49 VH gene

190 200
Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser
GAT TTT AAA TAC AAT GAG AGG TTC AAG GGC AAG GCC ACA CTG ACT GCA GAC AAA TCC TCC

210 220
Ser Thr Ala Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys
AGC ACT GCC TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT GCA GTG TAT TTC TGT

230 240
Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val Ser *** ***
ACA AGA TCC CTG AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC ACC GTC TCC TAA TAG

Asp
GAT CC

FIG. 19A (CONT.)

CC49 VL/217/4-4-20 VH gene

CC49 VL 10 20

Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly Glu Lys Val Thr
GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC GAG AAG GTT ACT

30 40
Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala
TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA TAT AGT GGT AAT CAA AAG AAC TAC TTG GCC

50 60
Trp Tyr Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg
TGG TAC CAG CAG AAA CCA GGG CAG TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC GCT AGG

70 80
Glu Ser Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
GAA TCT GGG GTC CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC TCC

90 100
Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr Ser Tyr
ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG TAT TAT AGC TAT

110 217 Linker 120
Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu Lys Gly Ser Thr Ser Gly Lys Pro
CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT GTG CTG AAA GGC TCT ACT TCC GGT AAA CCA
Hind III

4-4-20 VH 130 140
Ser Glu Gly Lys Gly Glu Val Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly
TCT GAA GGT AAA GGT GAA GTT AAA CTG GAT GAG ACT GGA GGA GGC TTG GTG CAA CCT GGG

150 160
Arg Pro Met Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met Asn
AGG CCC ATG AAA CTC TCC TGT GTT GCC TCT GGA TTC ACT TTT AGT GAC TAC TGG ATG AAC

170 180
Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln Ile Arg Asn Lys Pro
TGG GTC CGC CAG TCT CCA GAG AAA GGA CTG GAG TGG GTA GCA CAA ATT AGA AAC AAA CCT

FIG. 19B

CC49 VL/217/4-4-20 VH gene

200
Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg Asp
TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT GTG AAA GGC AGA TTC ACC ATC TCA AGA GAT

210 220
Asp Ser Lys Ser Ser Val Tyr Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile
GAT TCC AAA AGT AGT GTC TAC CTG CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG GGT ATC

230 240
Tyr Tyr Cys Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val Thr
TAT TAC TGT ACG GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCA GTC ACC

Val Ser *** *** Gly Ser
GTC TCC TAA TAA GGA TCC
Bam HI

PROCESSING FILE: PolyCatA/Proc.CC-49Prep

METHOD: PREP POLY CAT A#2
INJECT VOL: 44
SAMPLING INT: 0.3 SECONDS

CHROMATOGRAM:

ANALYSIS: CHANNEL A

PEAK NO.  TIME    TYPE  HEIGHT(µV)  AREA(µV-SEC)  AREA%
1         17.090  N1    1651        348239        0.778
2         18.940  N2    8014        669441        1.496
3         21.775  N3    104401      8617252       19.263
4         30.100  N4    74925       9753616       21.804
5         33.455  N5    106864      15749605      35.208
6         38.940  N6    17296       2833701       6.334
7         42.010  N7    12645       1637917       3.661
8         44.640  N8    9287        1968584       4.400
9         57.055  N9    13767       2012338       4.498
10        57.610  N10   9323        210914        0.471
11        58.240  N11   6824        930855        2.080
TOTAL AREA                          44732462      99.993

PROCESSING FILE: PolyCatA/Proc.CC-49Prep
METHOD: CC-49 QC SIZE-EXCLUSION
INJECT VOL: .05
SAMPLING INT: 0.1 SECONDS

CHROMATOGRAM:

ANALYSIS: CHANNEL A

PEAK NO.  TIME    TYPE  HEIGHT(µV)  AREA(µV-SEC)  AREA%
1         19.370  N1    797         41706         5.694
2         20.525  N2    11789       657280        89.737
3         22.851  N3    1227        33466         4.569
TOTAL AREA                          732452        100.000

FIG. 22A

PROCESSING FILE: PolyCatA/Proc.CC-49Prep
METHOD: CC-49 QC SIZE-EXCLUSION
INJECT VOL: .05
SAMPLING INT: 0.1 SECONDS

CHROMATOGRAM:

ANALYSIS: CHANNEL A

PEAK NO.  TIME    TYPE  HEIGHT(µV)  AREA(µV-SEC)  AREA%
1         19.133  N1    14211       749671        88.214
2         20.538  N2    1863        100154        11.785
TOTAL AREA                          849825        99.999

FIG. 22B

PROCESSING FILE: PolyCatA/Proc.CC-49Prep
METHOD: CC-49 QC SIZE-EXCLUSION
INJECT VOL: .05
SAMPLING INT: 0.1 SECONDS

CHROMATOGRAM:

ANALYSIS: CHANNEL A

PEAK NO.  TIME    TYPE  HEIGHT(µV)  AREA(µV-SEC)  AREA%
1         19.163  N1    17550       876502        83.039
2         20.435  N2    2981        179029        16.961
TOTAL AREA                          1055531       100.000

FIG. 22C
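
The AREA% column in the integration reports is consistent with each peak area divided by the total integrated area. A short Python sketch of that arithmetic follows, using the two peak areas reported for FIG. 22C. The function name `area_percent` is hypothetical, and rounding to three decimal places is an assumption about how the report was formatted.

```python
def area_percent(peak_areas):
    """Return each peak's share of the total area, in percent (3 decimals)."""
    total = sum(peak_areas)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return [round(100.0 * area / total, 3) for area in peak_areas]


# Peak areas for peaks 1 and 2 of the FIG. 22C report.
FIG_22C_AREAS = [876502, 179029]

if __name__ == "__main__":
    print(sum(FIG_22C_AREAS))           # total area
    print(area_percent(FIG_22C_AREAS))  # per-peak percentages
```

Applied to these two areas, the sketch reproduces the reported total of 1055531 and the reported shares of 83.039 and 16.961 percent.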